Abstract. Using Jeřábek's framework for probabilistic reasoning, we formalize the
correctness of two fundamental RNC$^2$ algorithms for bipartite perfect matching within the
theory VPV for polytime reasoning. The first algorithm tests whether a bipartite graph
has a perfect matching, and is based on the Schwartz-Zippel Lemma for polynomial
identity testing applied to the Edmonds polynomial of the graph. The second algorithm,
due to Mulmuley, Vazirani and Vazirani, finds a perfect matching; the key
ingredient of this algorithm is the Isolating Lemma.



There is a substantial literature on theories such as PV, $S^1_2$, VPV, $V^1$ which capture
polynomial time reasoning [51 01 \TT\ [7]. These theories prove the existence of polynomial time
functions, and in many cases they can prove properties of these functions that assert the
correctness of the algorithms computing these functions. But in general these theories cannot
prove the existence of probabilistic polynomial time relations such as those in ZPP, RP, BPP
because defining the relevant probabilities involves defining cardinalities of exponentially
large sets. Of course stronger theories, those which can define #P or PSPACE functions, can
treat these probabilities, but such theories are too powerful to capture the spirit of feasible
reasoning.

Note that we cannot hope to find theories that exactly capture probabilistic complexity 
classes such as ZPP, RP, BPP because these are 'semantic' classes which we suppose are not 
recursively enumerable (cf. [30]). Nevertheless there has been significant progress toward 
developing tools in weak theories that might be used to describe some of the algorithms in 
these classes. 



Paris, Wilkie and Woods [25] and Pudlák [26] observed that we can simulate approximate
counting in bounded arithmetic by applying variants of the weak pigeonhole principle.
It seems unlikely that any of these variants can be proven in the theories for polynomial
time, but they can be proven in Buss's theory $S_2$ for the polynomial hierarchy. The first
connection between the weak pigeonhole principle and randomized algorithms was noticed by
Wilkie (cf. p2]), who showed that randomized polytime functions witness $\Sigma^b_1$-consequences
of $S^1_2$ + sWPHP(PV) (i.e., $\Sigma^B_1$-consequences of $V^1$ + sWPHP($\mathcal{L}_{FP}$) in our two-sorted
framework), where sWPHP(PV) denotes the surjective weak pigeonhole principle for all PV
functions (i.e. polytime functions).

Building on these early results, Jeřábek [13] showed that we can "compare" the sizes
of two bounded P/poly definable sets within VPV by constructing a surjective mapping
from one set to another. Using this method, Jeřábek developed tools for describing
algorithms in ZPP and RP. He also showed in [TJ1 Q3] that the theory VPV + sWPHP($\mathcal{L}_{FP}$) is
powerful enough to formalize proofs of very sophisticated derandomization results, e.g. the
Nisan-Wigderson theorem [24] and the Impagliazzo-Wigderson theorem [12]. (Note that
Jeřábek actually used the single-sorted theory $PV_1$ + sWPHP(PV), but these two theories
are isomorphic.)

In [15], Jeřábek developed an even more systematic approach by showing that for any
bounded P/poly definable set, there exists a suitable pair of surjective "counting functions"
which can approximate the cardinality of the set up to a polynomially small error. From
this and other results he argued convincingly that VPV + sWPHP($\mathcal{L}_{FP}$) is the "right" theory
for reasoning about probabilistic polynomial time algorithms. However, so far no one has
used his framework for feasible reasoning about specific interesting randomized algorithms
in classes such as RP and RNC$^2$.

In the present paper we analyze (in VPV) two such algorithms using Jeřábek's framework.
The first one is the RNC$^2$ algorithm for determining whether a bipartite graph has
a perfect matching, based on the Schwartz-Zippel Lemma [27, 33] for polynomial identity
testing applied to the Edmonds polynomial [9] associated with the graph. The second
algorithm, due to Mulmuley, Vazirani and Vazirani [22], is in the function class associated
with RNC$^2$, and uses the Isolating Lemma to find such a perfect matching when it exists.
Proving correctness of these algorithms involves proving that the probability of error is
bounded above by 1/2. We formulate this assertion in a way suggested by Jeřábek's
framework (see Definition 2.1). This involves defining polynomial time functions from $\{0,1\}^n$
onto $\{0,1\} \times \Phi(n)$, where $\Phi(n)$ is the set of random bit strings of length $n$ which cause an
error in the computation. We then show that VPV proves that the function is a surjection.

Our proofs are carried out in the theory VPV for polynomial time reasoning, without the
surjective weak pigeonhole principle sWPHP($\mathcal{L}_{FP}$). Jeřábek used the sWPHP($\mathcal{L}_{FP}$) principle
to prove theorems justifying the above definition of error probability, but we do not need it
to apply his definition.

Many proofs concerning determinants are based on the Lagrange expansion (also known 
as the Leibniz formula) 

$$\mathrm{Det}(A) = \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma) \prod_{i=1}^{n} A(i, \sigma(i))$$

where the sum is over exponentially many terms. Since our proofs in VPV can only use 
polynomial time concepts, we cannot formalize such proofs, and must use other techniques. 
In the same vein, the standard proof of the Schwartz-Zippel Lemma assumes that a multi- 
variate polynomial given by an arithmetic circuit can be expanded to a sum of monomials. 
But this sum in general has exponentially many terms so again we cannot directly formalize 
this proof in VPV. 
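
To make the obstacle concrete, the following Python sketch (an illustration only, not part of the formalization in VPV) contrasts the Lagrange expansion, which enumerates all $n!$ permutations, with an exact determinant computed by elimination in polynomially many arithmetic operations. The names `leibniz_det` and `elimination_det` are ours, and Gaussian elimination over the rationals is an assumption made purely for brevity; it is not the algorithm formalized in this paper.

```python
from fractions import Fraction
from itertools import permutations


def sign(perm):
    """Sign of a permutation given as a tuple of 0-based images."""
    inversions = sum(
        1
        for a in range(len(perm))
        for b in range(a + 1, len(perm))
        if perm[a] > perm[b]
    )
    return -1 if inversions % 2 else 1


def leibniz_det(m):
    """Lagrange (Leibniz) expansion: one term per permutation, n! terms."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for i in range(n):
            term *= m[i][perm[i]]
        total += term
    return total


def elimination_det(m):
    """Exact determinant of an integer matrix by elimination over the rationals."""
    n = len(m)
    a = [[Fraction(x) for x in row] for row in m]
    sign_, result = 1, Fraction(1)
    for c in range(n):
        # Find a row at or below the diagonal with a nonzero entry in column c.
        p = next((r for r in range(c, n) if a[r][c] != 0), None)
        if p is None:
            return 0
        if p != c:
            a[c], a[p] = a[p], a[c]
            sign_ = -sign_
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return int(sign_ * result)
```

Both functions agree on every integer matrix, but only the second remains usable as $n$ grows; this is the sense in which proofs that manipulate the expanded sum fall outside polynomial time reasoning.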
2. Preliminaries

2.1. Basic bounded arithmetic. The theory VPV for polynomial time reasoning used
here is a two-sorted theory described by Cook and Nguyen [7]. The two-sorted language
has variables $x, y, z, \ldots$ ranging over $\mathbb{N}$ and variables $X, Y, Z, \ldots$ ranging over finite subsets
of $\mathbb{N}$, interpreted as bit strings. The two-sorted vocabulary $\mathcal{L}^2_A$ includes the usual symbols
$0, 1, +, \cdot, =, \le$ for arithmetic over $\mathbb{N}$, the length function $|X|$ for strings, the set membership
relation $\in$, and string equality $=_2$ (the subscript 2 is usually omitted). We will use the notation
$X(t)$ for $t \in X$, and think of $X(t)$ as the $t$-th bit in the string $X$.

The number terms in the base language $\mathcal{L}^2_A$ are built from the constants 0, 1, variables
$x, y, z, \ldots$ and length terms $|X|$ using $+$ and $\cdot$. The only string terms are string variables,
but when we extend $\mathcal{L}^2_A$ by adding string-valued functions, other string terms will be built
as usual. The atomic formulas are $t = u$, $X = Y$, $t \le u$, $t \in X$ for any number terms $u, t$ and
string variables $X, Y$. Formulas are built from atomic formulas using $\wedge, \vee, \neg$ and $\exists x, \exists X,
\forall x, \forall X$. Bounded number quantifiers are defined as usual, and the bounded string quantifier
$\exists X \le t\, \varphi$ stands for $\exists X(|X| \le t \wedge \varphi)$ and $\forall X \le t\, \varphi$ stands for $\forall X(|X| \le t \to \varphi)$, where
$X$ does not appear in the term $t$.

$\Sigma^B_0$ is the class of all $\mathcal{L}^2_A$-formulas with no string quantifiers and only bounded number
quantifiers. $\Sigma^B_1$-formulas are those of the form $\exists \vec{X} \le \vec{t}\, \varphi$, where $\varphi \in \Sigma^B_0$ and the prefix of
the bounded quantifiers might be empty. These classes are extended to $\Sigma^B_i$ (and $\Pi^B_i$) for
all $i \ge 0$, in the usual way.

Two-sorted complexity classes contain relations $R(\vec{x}, \vec{X})$, where $\vec{x}$ are number arguments
and $\vec{X}$ are string arguments. In defining complexity classes using machines or circuits,
the number arguments are represented in unary notation and the string arguments are
represented in binary. The string arguments are the main inputs, and the number arguments
are auxiliary inputs that can be used to index the bits of strings.

In the two-sorted setting, we can define AC$^0$ to be the class of relations $R(\vec{x}, \vec{X})$ such
that some alternating Turing machine accepts $R$ in time $O(\log n)$ with a constant number
of alternations, where $n$ is the sum of all the numbers in $\vec{x}$ and the total length of all the
string arguments in $\vec{X}$. Then from the descriptive complexity characterization of AC$^0$, it
can be shown that a relation $R(\vec{x}, \vec{X})$ is in AC$^0$ iff it is represented by some $\Sigma^B_0$-formula
$\varphi(\vec{x}, \vec{X})$.

Given a class of relations C, we associate a class FC of string-valued functions $F(\vec{x}, \vec{X})$
and number functions $f(\vec{x}, \vec{X})$ with C as follows. We require these functions to be
p-bounded, i.e., the length of the outputs of $F$ and $f$ is bounded by a polynomial in $\vec{x}$ and
$|\vec{X}|$. Then we define FC to consist of all p-bounded number functions whose graphs are in
C and all p-bounded string functions whose bit graphs are in C.

We write $\Sigma^B_0(\mathcal{L})$ to denote the class of $\Sigma^B_0$-formulas which may have function and
predicate symbols from $\mathcal{L} \cup \mathcal{L}^2_A$. A string function is $\Sigma^B_0(\mathcal{L})$-definable if it is p-bounded
and its bit graph is represented by a $\Sigma^B_0(\mathcal{L})$-formula. Similarly, a number function is
$\Sigma^B_0$-definable from $\mathcal{L}$ if it is p-bounded and its graph is represented by a $\Sigma^B_0(\mathcal{L})$-formula.

The theory $V^0$ for AC$^0$ is the basis for developing theories for small complexity classes
within P in [7]. The theory $V^0$ has the vocabulary $\mathcal{L}^2_A$ and is axiomatized by the set
of 2-BASIC axioms given in Figure 1, which express basic properties of the symbols in $\mathcal{L}^2_A$,
together with the comprehension axiom schema

$\Sigma^B_0(\mathcal{L}^2_A)$-COMP: $\exists X \le y\, \forall z < y\, (X(z) \leftrightarrow \varphi(z))$,

where $\varphi \in \Sigma^B_0(\mathcal{L}^2_A)$ and $X$ does not occur free in $\varphi$.

Figure 1: The 2-BASIC axioms

In [7, Chapter 5], it was shown that $V^0$ is finitely axiomatizable and a p-bounded
function is in FAC$^0$ iff it is provably total in $V^0$. A universally-axiomatized conservative
extension $\overline{V}^0$ of $V^0$ was also obtained by introducing function symbols and their defining
axioms for all FAC$^0$ functions.

In [7, Chapter 9], Cook and Nguyen showed how to associate a theory VC to each
complexity class C $\subseteq$ P, where VC extends $V^0$ with an additional axiom asserting the
existence of a solution to a complete problem for C. General techniques are also presented
for defining a universally-axiomatized conservative extension $\overline{VC}$ of VC which has
function symbols and defining axioms for every function in FC, and $\overline{VC}$ admits induction on
open formulas in this enriched vocabulary. It follows from Herbrand's Theorem that the
provably-total functions in $\overline{VC}$ (and hence in VC) are precisely the functions in FC. Using
this framework, Cook and Nguyen explicitly defined theories for various complexity classes
within P.

Since we need some basic linear algebra in this paper, we are interested in the two-sorted
theory $V\#L$ and its universal conservative extension $\overline{V\#L}$ from [6]. Recall that #L
is usually defined as the class of functions $f$ such that for some nondeterministic logspace
Turing machine $M$, $f(x)$ is the number of accepting computations of $M$ on input $x$. Since
counting the number of accepting paths of nondeterministic logspace is AC$^0$-equivalent to
matrix powering, $V\#L$ was defined to be the extension of the base theory $V^0$ with an
additional axiom stating the existence of powers $A^k$ for every matrix $A$ over $\mathbb{Z}$. The closure
of #L under AC$^0$-reductions is called DET. It turns out that computing the determinant of
integer matrices is complete for DET under AC$^0$-reductions. In fact Berkowitz's algorithm
can be used to reduce the determinant to matrix powering. Moreover, $V\#L$ proves that
the function Det, which computes the determinant of integer matrices based on Berkowitz's
algorithm, is in the language of $V\#L$. Unfortunately it is an open question whether the
theory $V\#L$ also proves the cofactor expansion formula and other basic properties of
determinants. However from results in [29] it follows that $V\#L$ proves that the usual properties
of determinants follow from the Cayley-Hamilton Theorem (which states that a matrix
satisfies its characteristic polynomial).

In this paper, we are particularly interested in the theory VPV for polytime reasoning
[7, Chapter 8.2] since we will use it to formalize all of our theorems. The universal theory
VPV is based on Cook's single-sorted theory PV [8], which was historically the first theory
designed to capture polytime reasoning. A nice property of PV (and VPV) is that their
universal theorems translate into families of propositional tautologies with polynomial size
proofs in any extended Frege proof system.

The vocabulary $\mathcal{L}_{FP}$ of VPV extends that of $V^0$ with additional symbols introduced
based on Cobham's machine independent characterization of FP [5]. Let $Z^{<y}$ denote the
first $y$ bits of $Z$. Formally the vocabulary $\mathcal{L}_{FP}$ of VPV is the smallest set satisfying

(1) $\mathcal{L}_{FP}$ contains the vocabulary of $V^0$

(2) For any two functions $G(\vec{x}, \vec{X})$, $H(y, \vec{x}, \vec{X}, Z)$ over $\mathcal{L}_{FP}$ and an $\mathcal{L}^2_A$-term $t = t(y, \vec{x}, \vec{X})$, if
$F$ is defined by limited recursion from $G$, $H$ and $t$, i.e.,

$$F(0, \vec{x}, \vec{X}) = G(\vec{x}, \vec{X}),$$
$$F(y + 1, \vec{x}, \vec{X}) = H(y, \vec{x}, \vec{X}, F(y, \vec{x}, \vec{X}))^{<t(y, \vec{x}, \vec{X})},$$

then $F \in \mathcal{L}_{FP}$.

We will often abuse the notation by letting $\mathcal{L}_{FP}$ denote the set of function symbols in $\mathcal{L}_{FP}$.

The theory VPV can then be defined to be the theory over $\mathcal{L}_{FP}$ whose axioms are those
of $V^0$ together with defining axioms for every function symbol in $\mathcal{L}_{FP}$. VPV proves the
scheme $\Sigma^B_0(\mathcal{L}_{FP})$-COMP and the following schemes

$\Sigma^B_0(\mathcal{L}_{FP})$-IND: $(\varphi(0) \wedge \forall x(\varphi(x) \to \varphi(x + 1))) \to \forall x\, \varphi(x)$
$\Sigma^B_0(\mathcal{L}_{FP})$-MIN: $\varphi(y) \to \exists x(\varphi(x) \wedge \neg \exists z < x\, \varphi(z))$
$\Sigma^B_0(\mathcal{L}_{FP})$-MAX: $\varphi(0) \to \exists x \le y(\varphi(x) \wedge \neg \exists z \le y(z > x \wedge \varphi(z)))$

where $\varphi$ is any $\Sigma^B_0(\mathcal{L}_{FP})$-formula. It follows from Herbrand's Theorem that the
provably-total functions in VPV are precisely the functions in $\mathcal{L}_{FP}$.

Observe that VPV extends $\overline{V\#L}$ since matrix powering can easily be carried out in
polytime, and thus all theorems of $V\#L$ from [6, 29] are also theorems of VPV. From
results in [29] (see page 44 of [14] for a correction) it follows that VPV proves the
Cayley-Hamilton Theorem, and hence the cofactor expansion formula and other usual properties
of determinants of integer matrices.

In our introduction, we mentioned $V^1$, the two-sorted version of Buss's $S^1_2$ theory [JJ].
The theory $V^1$ is also associated with polytime reasoning in the sense that the provably
total functions of $V^1$ are FP functions, and $V^1$ is $\Sigma^B_1$-conservative over VPV. However,
there is evidence showing that $V^1$ is stronger than VPV. For example, the theory $V^1$
proves the $\Sigma^B_1$-IND, $\Sigma^B_1$-MIN and $\Sigma^B_1$-MAX schemes while VPV cannot prove these $\Sigma^B_1$
schemes, assuming the polynomial hierarchy does not collapse [18]. In this paper we do not
use $V^1$ to formalize our theorems, since the weaker theory VPV suffices for all our needs.

2.2. Notation. If $\varphi$ is a function of $n$ variables, then let $\varphi(a_1, \ldots, a_{n-1}, \cdot)$ denote the
function of one variable resulting from $\varphi$ by fixing the first $n - 1$ arguments to $a_1, \ldots, a_{n-1}$.
We write $\varphi : A \twoheadrightarrow \Phi$ to denote that $\varphi$ is a surjection from $A$ onto $\Phi$.

We use $[X, Y)$ to denote $\{Z \in \mathbb{Z} \mid X \le Z < Y\}$, i.e., the interval of integers between
$X$ and $Y - 1$, where strings code integers using signed binary notation. We also use the
standard notation $[n]$ to denote the set $\{1, \ldots, n\}$.

Given a square matrix $M$, we write $M[i \mid j]$ to denote the $(i, j)$-minor of $M$, i.e., the
square matrix formed by removing the $i$th row and $j$th column from $M$.

We write $\vec{x}$ to denote a number sequence $(x_1, \ldots, x_k)$ and $\vec{X}$ to denote a string
sequence $(X_1, \ldots, X_k)$. We write $\vec{x}_{k \times k}$ and $\vec{X}_{k \times k}$ to denote that $\vec{x}$ and $\vec{X}$ have $k^2$ elements
and are treated as two-dimensional arrays $\langle x_{i,j} \mid 1 \le i, j \le k \rangle$ and $\langle X_{i,j} \mid 1 \le i, j \le k \rangle$
respectively, where the elements of these two-dimensional arrays are listed by rows. Note that
$\vec{x}_{k \times k}$ and $\vec{X}_{k \times k}$ can be simply encoded as integer matrices, and thus we will use matrix
notation freely on them.

We write the notation "($T \vdash$)" in front of the statement of a theorem to indicate that
the statement is formulated and proved within the theory $T$.

2.3. The weak pigeonhole principle. The surjective weak pigeonhole principle for a
function $F$, denoted by sWPHP($F$), states that $F$ cannot map $[0, nA)$ onto $[0, (n + 1)A)$.
Thus, the surjective weak pigeonhole principle for the class of VPV functions, denoted by
sWPHP($\mathcal{L}_{FP}$), is the schema

$$\{\mathrm{sWPHP}(F) \mid F \in \mathcal{L}_{FP}\}.$$

Note that this principle is believed to be weaker than the usual surjective "strong"
pigeonhole principle stating that we cannot map $[0, A)$ onto $[0, A + 1)$. For example,
sWPHP($\mathcal{L}_{FP}$) can be proven in the theory $V^3$ (the two-sorted version of Buss's theory
$S^3_2$) for $\mathrm{FP}^{\Sigma^p_3}$ reasoning (cf. [30]), but it is not known if the usual surjective pigeonhole
principle for VPV functions can be proven within the theory $\bigcup_{i \ge 1} V^i$ for the polynomial
hierarchy (the two-sorted version of Buss's theory $S_2 := \bigcup_{i \ge 1} S^i_2$ in @]).

2.4. Jeřábek's framework for probabilistic reasoning. In this section, we give a brief
and simplified overview of Jeřábek's framework [13, 14, 15] for probabilistic reasoning within
VPV + sWPHP($\mathcal{L}_{FP}$). For more complete definitions and results, the reader is referred to
Jeřábek's work.

Let $F(R)$ be a VPV 0-1 valued function (which may have other arguments). We think
of $F$ as defining a relation on binary numbers $R$. Let $\Phi(n) = \{R < 2^n \mid F(R) = 1\}$. Observe
that bounding the probability $\Pr_{R < 2^n}[F(R) = 1]$ from above by the ratio $s/t$ is the same
as showing that $t \cdot |\Phi(n)| \le s \cdot 2^n$. More generally, many probability inequalities can be
restated as inequalities between cardinalities of sets. This is problematic since even for the
case of polytime definable sets, it follows from Toda's theorem [31] that we cannot express
their cardinalities directly using bounded formulas (assuming that the polynomial hierarchy
does not collapse). Hence we need an alternative method to compare the sizes of definable
sets without exact counting.

The method proposed by Jeřábek in [13, 14, 15] is based on the following simple
observation: if $\Gamma(n)$ and $\Phi(n)$ are definable sets and there is a function $F$ mapping $\Gamma(n)$
onto $\Phi(n)$, then the cardinality of $\Phi(n)$ is at most the cardinality of $\Gamma(n)$. Thus instead
of counting the sets $\Gamma(n)$ and $\Phi(n)$ directly, we can compare the sizes of $\Gamma(n)$ and $\Phi(n)$
by showing the existence of a surjection $F$, which in many cases can be easily carried out
within weak theories of bounded arithmetic. In this paper we will restrict our discussion to
the case when the sets are bounded polytime definable sets and the surjections are polytime
functions, all of which can be defined within VPV, since this is sufficient for our results.

The remaining challenge is then to formally verify that the definition of cardinality
comparison through the use of surjections is a meaningful and well-behaved definition. The
basic properties of surjections like "any set can be mapped onto itself" and "surjectivity is
preserved through function compositions" roughly correspond to the usual reflexivity and
transitivity of cardinality ordering, i.e., $|\Phi| \le |\Phi|$ and $|\Phi| \le |\Gamma| \le |\Lambda| \to |\Phi| \le |\Lambda|$ for all
bounded definable sets $\Phi$, $\Gamma$ and $\Lambda$. However more sophisticated properties, e.g., dichotomy
$|\Phi| \le |\Gamma| \vee |\Gamma| \le |\Phi|$ or "uniqueness" of cardinality, turn out to be much harder to show.

As a result, Jeřábek proposed in [15] a systematic and sophisticated framework to justify
his definition of size comparison. He observed that estimating the size of a P/poly definable
set $\Phi \subseteq [0, 2^n)$ within an error $2^n/\mathrm{poly}(n)$ is the same as estimating $\Pr_{X \in [0, 2^n)}[X \in \Phi]$ within
an error $1/\mathrm{poly}(n)$, which can be solved by drawing $\mathrm{poly}(n)$ independent random samples
$X \in [0, 2^n)$ and checking whether $X \in \Phi$. This gives us a polytime random sampling algorithm for
approximating the size of $\Phi$. Since a counting argument [13] can be formalized within
VPV + sWPHP($\mathcal{L}_{FP}$) to show the existence of suitable average-case hard functions for constructing
Nisan-Wigderson generators, this random sampling algorithm can be derandomized to show
the existence of an approximate cardinality $S$ of $\Phi$ for any given error $E = 2^n/\mathrm{poly}(n)$ in
the following sense. The theory VPV + sWPHP($\mathcal{L}_{FP}$) proves the existence of $S$, $y$ and a pair
of P/poly "counting functions" $(F, G)$

$$F : [0, y) \times (\Phi \uplus [0, E)) \twoheadrightarrow [0, y \cdot S)$$
$$G : [0, y(S + E)) \twoheadrightarrow [0, y) \times \Phi$$

Intuitively the pair $(F, G)$ witnesses that $S - E \le |\Phi| \le S + E$. This allows him to show
that many properties expected of cardinality comparison are satisfied by his method
within VPV + sWPHP($\mathcal{L}_{FP}$) (see Lemmas 2.10 and 2.11 in [15]). It is worth noting that
proving the uniqueness of cardinality within some error seems to be the best we can do
within bounded arithmetic, where exact counting is not available.

For the present paper, the following definition is all we need to know about Jeřábek's
framework.

Definition 2.1. Let $\Phi(n) = \{R < 2^n \mid F(R) = 1\}$, where $F(R)$ is a VPV function (which
may have other arguments) and let $s, t$ be VPV terms. Then

$$\Pr_{R < 2^n}[R \in \Phi(n)] \precsim s/t$$

means that either $\Phi(n)$ is empty, or there exists a VPV function $G(n, \cdot)$ mapping the set
$[s] \times 2^n$ onto the set $[t] \times \Phi(n)$.

Since we are not concerned with justifying the above definition, our theorems can be 
formalized in VPV without sWPHP. 
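
As a toy illustration of Definition 2.1 (a sketch with hypothetical names, checked by brute force rather than proved in VPV), take $F(R) = 1$ iff the two low-order bits of $R$ are zero, so that exactly a quarter of all $R < 2^n$ lie in $\Phi(n)$ when $n \ge 2$. The map `g` below sends $[1] \times [0, 2^n)$ onto $[4] \times \Phi(n)$, which is what the definition requires for the bound $1/4$.

```python
def in_phi(r):
    """Toy 0-1 valued predicate F(R): the two low-order bits of R are zero."""
    return r % 4 == 0


def g(k, r):
    """Hypothetical witness G(n, .): sends (k, R) in [1] x [0, 2^n) into [4] x Phi(n).

    The discarded low-order bits of R choose the first coordinate in {1,...,4}.
    The argument k is unused because [1] has a single element.
    """
    return (r % 4 + 1, r - r % 4)


def is_onto(n, s, t):
    """Brute-force check that g maps [s] x [0, 2^n) onto [t] x Phi(n)."""
    domain = [(k, r) for k in range(1, s + 1) for r in range(2 ** n)]
    target = {(k, r) for k in range(1, t + 1) for r in range(2 ** n) if in_phi(r)}
    image = {g(k, r) for (k, r) in domain}
    return target <= image
```

For example, `is_onto(5, 1, 4)` holds, while `is_onto(5, 1, 5)` fails because no input is mapped to a pair whose first coordinate is 5.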

3. Edmonds' Theorem

Let $G$ be a bipartite graph with two disjoint sets of vertices $U = \{u_1, \ldots, u_n\}$ and $V =
\{v_1, \ldots, v_n\}$. We use a pair $(i, j)$ to encode the edge $\{u_i, v_j\}$ of $G$. Thus the edge relation
of the graph $G$ can be encoded by a boolean matrix $E_{n \times n}$, where we define $(i, j) \in E$, i.e.
$E(i, j) = 1$, iff $\{u_i, v_j\}$ is an edge of $G$.




D. T. M. LE AND S.A. COOK 



Each perfect matching in $G$ can be encoded by an $n \times n$ permutation matrix $M$ satisfying
$M(i, j) \to E(i, j)$ for all $i, j \in [n]$. Recall that a permutation matrix is a square boolean
matrix that has exactly one entry of value 1 in each row and each column and 0's elsewhere.

Let $A_{n \times n}$ be the matrix obtained from $G$ by letting $A_{i,j}$ be an indeterminate $X_{i,j}$ for
all $(i, j) \in E$, and let $A_{i,j} = 0$ for all $(i, j) \notin E$. The matrix of indeterminates $A(\vec{X})$ is
called the Edmonds matrix of $G$, and $\mathrm{Det}(A(\vec{X}))$ is called the Edmonds polynomial of $G$. In
general this polynomial has exponentially many monomials, so for the purpose of proving
its properties in VPV we consider $\mathrm{Det}(A(\vec{X}))$ to be a function which takes as input an
integer matrix $\vec{W}_{n \times n}$ and returns an integer $\mathrm{Det}(A(\vec{W}))$. Thus $\mathrm{Det}(A(\vec{X})) \equiv 0$ means that
this function is identically zero.

The following theorem draws an important connection between determinants and match- 
ings. The standard proof uses the Lagrange expansion which has exponentially many terms, 
and hence cannot be formalized in VPV. However we will give an alternative proof which 
can be so formalized. 

Theorem 3.1 (Edmonds' Theorem [9]). (VPV $\vdash$) Let $\mathrm{Det}(A(\vec{X}))$ be the Edmonds
polynomial of the bipartite graph $G$. Then $G$ has a perfect matching iff $\mathrm{Det}(A(\vec{X})) \not\equiv 0$ (i.e. iff
there exists an integer matrix $\vec{W}$ such that $\mathrm{Det}(A(\vec{W})) \ne 0$).

Proof. For the direction ($\Rightarrow$) we need the following lemma.

Lemma 3.2. (VPV $\vdash$) $\mathrm{Det}(M) \in \{-1, 1\}$ for any permutation matrix $M$.

Proof of Lemma 3.2. We will construct a sequence of matrices

$$N_n, N_{n-1}, \ldots, N_1,$$

where $N_n = M$, $N_1 = (1)$, and we construct $N_{i-1}$ from $N_i$ by choosing $j_i$ satisfying $N_i(i, j_i) =
1$ and letting $N_{i-1} = N_i[i \mid j_i]$.

From the way the matrices $N_i$ are constructed, we can easily show by $\Sigma^B_0(\mathcal{L}_{FP})$ induction
on $\ell = n, \ldots, 1$ that the matrices $N_\ell$ are permutation matrices. Finally, using the cofactor
expansion formula, we prove by $\Sigma^B_0(\mathcal{L}_{FP})$ induction on $\ell = 1, \ldots, n$ that $\mathrm{Det}(N_\ell) \in \{-1, 1\}$.

□

From the lemma we see that if $M$ is the permutation matrix representing a perfect
matching of $G$, then VPV proves $\mathrm{Det}(A(M)) = \mathrm{Det}(M) \in \{1, -1\}$, so $\mathrm{Det}(A(\vec{X}))$ is not
identically 0.

For the direction ($\Leftarrow$) it suffices to describe a polytime function $F$ that takes as input
an integer matrix $B_{n \times n} = A(\vec{W})$, where $A(\vec{X})$ is the Edmonds matrix of a bipartite graph
$G$ and $\vec{W}_{n \times n}$ is an integer value assignment, and reason in VPV that if $\mathrm{Det}(B) \ne 0$, then
$F$ outputs a perfect matching of $G$.

Assume $\mathrm{Det}(B) \ne 0$. Note that finding a perfect matching of $G$ is the same as extracting
a nonzero diagonal, i.e., a sequence of nonzero entries $B(1, \sigma(1)), B(2, \sigma(2)), \ldots, B(n, \sigma(n))$,
where $\sigma$ is a permutation of the set $[n]$. For this purpose, we construct a sequence of matrices

$$B_n, B_{n-1}, \ldots, B_1,$$

as follows. We let $B_n = B$. For $i = n, \ldots, 2$, we let $B_{i-1} = B_i[i \mid j_i]$ and the index $j_i$ is
chosen using the following method. Suppose we already know $B_i$ satisfying $\mathrm{Det}(B_i) \ne 0$.
By the cofactor expansion along the last row of $B_i$,

$$\mathrm{Det}(B_i) = \sum_{j=1}^{i} B_i(i, j)(-1)^{i+j} \mathrm{Det}(B_i[i \mid j]).$$

Thus, since $\mathrm{Det}(B_i) \ne 0$, at least one of the terms in the sum on the right-hand side is
nonzero. Thus, we can choose the least index $j_i$ such that $B_i(i, j_i) \cdot \mathrm{Det}(B_i[i \mid j_i]) \ne 0$.

To extract the perfect matching, we let $Q$ be an $n \times n$ matrix, where $Q(i, j) = j$. Then
we construct a sequence of matrices

$$Q_n, Q_{n-1}, \ldots, Q_1,$$

where $Q_n = Q$ and $Q_{i-1} = Q_i[i \mid j_i]$, i.e., we delete from $Q_i$ exactly the row and column we
deleted from $B_i$. We define a permutation $\sigma$ by letting $\sigma(i) = Q_i(i, j_i)$. Then $\sigma(i)$ is the
column number in $B$ which corresponds to column $j_i$ in $B_i$, and the set of edges

$$\{(i, \sigma(i)) \mid 1 \le i \le n\}$$

is our desired perfect matching. □
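
The extraction procedure in the proof can be sketched in Python as follows. This is an illustration under our own assumptions, not the formal VPV definition: indices are 0-based, the list `cols` plays the role of the label matrices $Q_i$, and the helper `det` uses exact elimination over the rationals rather than Berkowitz's algorithm. The names `det`, `minor` and `extract_matching` are ours.

```python
from fractions import Fraction


def det(m):
    """Exact determinant of an integer matrix (the empty matrix has determinant 1)."""
    n = len(m)
    a = [[Fraction(x) for x in row] for row in m]
    sign, result = 1, Fraction(1)
    for c in range(n):
        p = next((r for r in range(c, n) if a[r][c] != 0), None)
        if p is None:
            return 0
        if p != c:
            a[c], a[p] = a[p], a[c]
            sign = -sign
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return int(sign * result)


def minor(m, i, j):
    """M[i | j]: remove row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]


def extract_matching(b):
    """Given B = A(W) with Det(B) != 0, return sigma with B[i][sigma[i]] != 0 for all i."""
    if det(b) == 0:
        raise ValueError("Det(B) must be nonzero")
    n = len(b)
    cur = [row[:] for row in b]   # current matrix B_i
    cols = list(range(n))         # original column labels of cur (the role of Q_i)
    sigma = [None] * n
    for i in range(n - 1, -1, -1):
        # Least column j of the last row whose term in the cofactor expansion is nonzero.
        j = next(
            j for j in range(i + 1)
            if cur[i][j] != 0 and det(minor(cur, i, j)) != 0
        )
        sigma[i] = cols[j]
        cur = minor(cur, i, j)
        del cols[j]
    return sigma
```

The search for `j` always succeeds because the invariant $\mathrm{Det}(B_i) \ne 0$ is maintained at every step, exactly as in the cofactor expansion argument above.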



4. Schwartz-Zippel Lemma

The Schwartz-Zippel Lemma [27, 33] is one of the most fundamental tools in the design of
randomized algorithms. The lemma provides us a coRP algorithm for the polynomial identity
testing problem (PIT): given an arithmetic circuit computing a multivariate polynomial
$P(\vec{X})$ over a field $\mathbb{F}$, we want to determine if $P(\vec{X})$ is identically zero. The PIT problem
is important since many problems, e.g., primality testing p], perfect matching [22], and
software run-time testing [32], can be reduced to PIT. Moreover, many fundamental results
in complexity theory like IP = PSPACE [28] and the PCP theorem [2, 3] make heavy use of
PIT in their proofs. The Schwartz-Zippel Lemma can be stated as follows.

Theorem 4.1 (Schwartz-Zippel Lemma). Let $P(X_1, \ldots, X_n)$ be a non-zero polynomial of
degree $D \ge 0$ over a field (or integral domain) $\mathbb{F}$. Let $S$ be a finite subset of $\mathbb{F}$ and let $\vec{R}$
denote the sequence $(R_1, \ldots, R_n)$. Then

$$\Pr_{\vec{R} \in S^n}[P(\vec{R}) = 0] \le \frac{D}{|S|}.$$

Using this lemma, we have the following coRP algorithm for the PIT problem when
$\mathbb{F} = \mathbb{Z}$. Given a polynomial $P(X_1, \ldots, X_n)$
of degree at most $D$, we choose a sequence $\vec{R} \in [0, 2D)^n$ at random. If $P$ is given implicitly
as a circuit, the degree of $P$ might be exponential, and thus the value of $P(\vec{R})$ might require
exponentially many bits to encode. In this case we use the method of Ibarra and Moran [10]
and let $Y$ be the result of evaluating $P(\vec{R})$ using arithmetic modulo a random integer from
the interval $[1, D^k]$ for some fixed $k$. If $Y = 0$, then we report that $P \equiv 0$. Otherwise, we
report that $P \not\equiv 0$. (Note that if $P$ has small degree, then we can evaluate $P(\vec{R})$ directly.)
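
A minimal Python sketch of this test for the small-degree case is given below. It assumes the polynomial is supplied as an ordinary Python callable on integers and that `degree_bound` is at least 1; the modular evaluation of Ibarra and Moran is deliberately not implemented. The function name and the optional `trials` parameter (independent repetitions, which are not part of the description above) are our own.

```python
import random


def looks_identically_zero(p, num_vars, degree_bound, trials=1, rng=random):
    """One-sided identity test: sample points from [0, 2D)^n and evaluate p directly.

    Returns False only if a point with p != 0 was found (so p is certainly nonzero).
    Returns True if every sampled point was a zero of p.
    """
    for _ in range(trials):
        point = [rng.randrange(2 * degree_bound) for _ in range(num_vars)]
        if p(*point) != 0:
            return False
    return True
```

By Theorem 4.1 with $|S| = 2D$, a nonzero polynomial of degree at most $D$ passes a single trial with probability at most 1/2, so each extra trial halves the bound on the error.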

Unfortunately the Schwartz-Zippel Lemma seems hard to prove in bounded arithmetic.
The main challenge is that the degree of $P$ can be exponentially large. Even in the special
case when $P$ is given as the symbolic determinant of a matrix of indeterminates and hence
the degree of $P$ is small, the polynomial $P$ still has up to $n!$ terms. Thus, we will focus
on a much weaker version of the Schwartz-Zippel Lemma that involves only Edmonds'
polynomials since this will suffice for us to establish the correctness of a FRNC$^2$ algorithm
for deciding if a bipartite graph has a perfect matching.

4.1. Edmonds' polynomials for complete bipartite graphs. In this section we will
start with the simpler case when every entry of an Edmonds matrix is a variable, since it
clearly demonstrates our techniques. This case corresponds to the Schwartz-Zippel Lemma
for Edmonds' polynomials of complete bipartite graphs.

Let $A$ be the full $n \times n$ Edmonds matrix, where $A_{i,j} = X_{i,j}$ for all $1 \le i, j \le n$. We
consider the case that $S$ is the interval of integers $S = [0, s)$ for $s \in \mathbb{N}$, so $|S| = s$. Then
$\mathrm{Det}(A(\vec{X}))$ is a nonzero polynomial of degree exactly $n$, and we want to show that

$$\Pr_{\vec{r} \in S^{n^2}}[\mathrm{Det}(A(\vec{r})) = 0] \precsim \frac{n}{s}.$$

Let

$$Z(n, s) := \{\vec{r} \in S^{n^2} \mid \mathrm{Det}(A(\vec{r})) = 0\},$$

i.e., the set of zeros of the Edmonds polynomial $\mathrm{Det}(A(\vec{X}))$. Then by Definition 2.1, it
suffices to exhibit a VPV function mapping $[n] \times S^{n^2}$ onto $S \times Z(n, s)$. For this it suffices
to give a VPV function mapping $[n] \times S^{n^2 - 1}$ onto $Z(n, s)$. We will define a VPV function

$$F(n, s, \cdot) : [n] \times S^{n^2 - 1} \twoheadrightarrow Z(n, s),$$

so $F(n, s, \cdot)$ takes as input a pair $(i, \vec{r})$, where $i \in [n]$ and $\vec{r} \in S^{n^2 - 1}$ is a sequence of $n^2 - 1$
elements.

Let $B$ be an $n \times n$ matrix with elements from $S$. For $i \in [n]$ let $B_i$ denote the leading
principal submatrix of $B$ that consists of the $i \times i$ upper-left part of $B$. In other words,
$B_n = B$, and $B_{i-1} := B_i[i \mid i]$ for $i = n, \ldots, 2$. The following fact follows easily from the
least number principle $\Sigma^B_0(\mathcal{L}_{FP})$-MIN.

Fact 4.2. (VPV $\vdash$) If $\mathrm{Det}(B) = 0$, then there is $i \in [n]$ such that $\mathrm{Det}(B_j) = 0$ for all
$i \le j \le n$, and either $i = 1$ or $i > 1$ and $\mathrm{Det}(B_{i-1}) \ne 0$.

We claim that given $\mathrm{Det}(B) = 0$ and given $i$ as in the fact, the element $B(i, i)$ is
uniquely determined by the other elements in $B$. Thus if $i = 1$ then $B(i, i) = 0$, and if $i > 1$
then by the cofactor expansion of $\mathrm{Det}(B_i)$ along row $i$,

$$0 = \mathrm{Det}(B_i) = B_i(i, i) \cdot \mathrm{Det}(B_{i-1}) + \mathrm{Det}(B'_i) \qquad (4.1)$$

where $B'_i$ is obtained from $B_i$ by setting $B'_i(i, i) = 0$. This equation uniquely determines
$B_i(i, i)$ because $\mathrm{Det}(B_{i-1}) \ne 0$.

The output of $F(n, s, (i, \vec{r}))$ is defined as follows. Let $B$ be the $n \times n$ matrix determined
by the $n^2 - 1$ elements in $\vec{r}$ by inserting the symbol $*$ (for unknown) in the position for
$B(i, i)$. Try to use the method above to determine the value of $* = B_i(i, i)$, assuming that $*$
is chosen so that $\mathrm{Det}(B) = 0$. This method could fail because $\mathrm{Det}(B_{i-1}) = 0$. In this case,
or if the solution to the equation (4.1) gives a value for $B_i(i, i)$ which is not in $S$, output
the default "dummy" zero sequence $\vec{0}_{n \times n}$. Otherwise let $C$ be $B$ with $*$ replaced by the
obtained value of $B_i(i, i)$. If $\mathrm{Det}(C) = 0$ then output $C$, otherwise output the dummy zero
sequence.
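
The definition of $F$ can be exercised on small parameters with the following Python sketch. It is an illustration under our own assumptions: `det` is an exact rational-elimination determinant standing in for the VPV function Det, the helper names are hypothetical, and surjectivity is checked by brute force for fixed small $n$ and $s$ rather than proved.

```python
from fractions import Fraction
from itertools import product


def det(m):
    """Exact determinant of an integer matrix (the empty matrix has determinant 1)."""
    n = len(m)
    a = [[Fraction(x) for x in row] for row in m]
    sign, result = 1, Fraction(1)
    for c in range(n):
        p = next((r for r in range(c, n) if a[r][c] != 0), None)
        if p is None:
            return 0
        if p != c:
            a[c], a[p] = a[p], a[c]
            sign = -sign
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return int(sign * result)


def F(n, s, i, r):
    """F(n, s, (i, r)) with i in 1..n and r a sequence of n*n - 1 elements of [0, s)."""
    dummy = tuple(tuple([0] * n) for _ in range(n))
    pos = (i - 1) * n + (i - 1)
    flat = list(r[:pos]) + [0] + list(r[pos:])    # placeholder 0 at (i, i) gives B'
    b = [flat[k * n:(k + 1) * n] for k in range(n)]
    lead = lambda k: [row[:k] for row in b[:k]]   # leading principal submatrix B_k
    d_prev = det(lead(i - 1))
    if d_prev == 0:
        return dummy                              # equation (4.1) cannot be solved
    value = Fraction(-det(lead(i)), d_prev)       # solve (4.1) for B_i(i, i)
    if value.denominator != 1 or not 0 <= value < s:
        return dummy                              # solution is not an element of S
    b[i - 1][i - 1] = int(value)
    return tuple(map(tuple, b)) if det(b) == 0 else dummy


def zeros(n, s):
    """Z(n, s): all n x n matrices over [0, s) with determinant 0."""
    mats = (tuple(tuple(f[k * n:(k + 1) * n]) for k in range(n))
            for f in product(range(s), repeat=n * n))
    return {m for m in mats if det(m) == 0}


def image(n, s):
    """Image of F(n, s, .) over the whole domain [n] x S^(n*n - 1)."""
    return {F(n, s, i, r)
            for i in range(1, n + 1)
            for r in product(range(s), repeat=n * n - 1)}
```

For $n = 2$ and $s = 3$ the domain has $2 \cdot 3^3 = 54$ elements and `image(2, 3) == zeros(2, 3)`, so every singular matrix over $[0, 3)$ is hit.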
Theorem 4.3. (VPV $\vdash$) Let $A(\vec{X})$ be the Edmonds matrix of a complete bipartite graph
$K_{n,n}$. Let $S$ denote the set $[0, s)$. Then the function $F(n, s, \cdot)$ defined above is a polytime
surjection that maps $[n] \times S^{n^2 - 1}$ onto $Z(n, s)$.

Proof. It is easy to see that $F(n, s, \cdot)$ is polytime (in fact it belongs to the complexity class
DET). To see that $F$ is surjective, let $C$ be an arbitrary matrix in $Z(n, s)$, so $\mathrm{Det}(C) = 0$.
Let $i \in [n]$ be determined by Fact 4.2 when $B = C$. Let $\vec{r}$ be the sequence of $n^2 - 1$
elements consisting of the rows of $C$ with $C(i, i)$ deleted. Then the algorithm for computing
$F(n, s, (i, \vec{r}))$ correctly computes the missing element $C(i, i)$ and outputs $C$.

4.2. Edmonds' polynomials for general bipartite graphs. For general bipartite graphs,
an entry of an Edmonds matrix $A$ might be 0, so we cannot simply use leading principal
submatrices in our construction of the surjection $F$. However given a sequence $\vec{W}_{n \times n}$
making $\mathrm{Det}(A(\vec{W})) \ne 0$, it follows from Theorem 3.1 that we can find a perfect matching $M$
in polytime. Thus, the nonzero diagonal corresponding to the perfect matching $M$ will
play the role of the main diagonal in our construction. The rest of the proof will proceed
similarly. Thus, we have the following theorem.

Theorem 4.4. (VPV $\vdash$) There is a VPV function $H(n, s, A, \vec{W}, \cdot)$ where $A_{n \times n}$ is the
Edmonds matrix for an arbitrary bipartite graph and $\vec{W}$ is a sequence of $n^2$ (binary)
integers, such that if $\mathrm{Det}(A(\vec{W})) \ne 0$ then $H(n, s, A, \vec{W}, \cdot)$ maps $[n] \times S^{n^2 - 1}$ onto
$\{\vec{r} \in S^{n^2} \mid \mathrm{Det}(A(\vec{r})) = 0\}$, where $S = [0, s)$.

In other words, it follows from Definition 2.1 that the function $H(n, s, A, \vec{W}, \cdot)$ in the
theorem witnesses that

$$\Pr_{\vec{r} \in S^{n^2}}[\mathrm{Det}(A(\vec{r})) = 0] \precsim \frac{n}{s}.$$

Proof. Assume $\mathrm{Det}(A(W)) \neq 0$. Then the polytime function described in the proof of
Theorem 3.1 produces an $n \times n$ permutation matrix $M$ such that for all $i, j \in [n]$, if
$M(i,j) = 1$ then the element $A(i,j)$ in the Edmonds matrix $A$ is not zero. We apply the
algorithm in the proof of Theorem 4.3 except that the sequence of principal submatrices
of $B$ used in Fact 4.2 is replaced by the sequence $B_n, B_{n-1}, \ldots, B_1$ determined by $M$ as
follows. We let $B_n = B$, and for $i = n, \ldots, 2$ we let $B_{i-1} = B_i[i \mid j_i]$, where the indices
$j_i$ are chosen the same way as in the proof of Theorem 3.1 when constructing the perfect
matching $M$. □

We note that the mapping $H(n, s, A, W, \cdot)$ in this case may not be in DET since the
construction of $M$ depends on the sequential polytime algorithm from Theorem 3.1 for
extracting a perfect matching.
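
The counting bound witnessed by Theorems 4.3 and 4.4 can be sanity-checked by brute force on a tiny instance. The Python sketch below is only an illustration and is not part of the formalization: the names `det` and `count_zeros` are hypothetical, and the cofactor-expansion determinant is a stand-in that is practical only for very small $n$. It counts the sequences in $S^{n^2}$ on which the determinant of the Edmonds matrix vanishes and compares the count with $n \cdot s^{n^2-1}$, the size of the domain of the surjection.

```python
from itertools import product


def det(m):
    """Exact integer determinant by cofactor expansion along the first row."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([r[:j] + r[j + 1:] for r in m[1:]])
               for j in range(len(m)))


def count_zeros(edges, n, s):
    """Count r in S^(n*n), S = [0, s), such that Det(A(r)) = 0.

    `edges` is a set of pairs (i, j); entries of A outside `edges` are 0,
    so the corresponding coordinates of r are ignored.
    """
    zeros = 0
    for r in product(range(s), repeat=n * n):
        a = [[r[i * n + j] if (i, j) in edges else 0 for j in range(n)]
             for i in range(n)]
        if det(a) == 0:
            zeros += 1
    return zeros


n, s = 2, 3
complete = {(i, j) for i in range(n) for j in range(n)}  # K_{2,2}
zeros = count_zeros(complete, n, s)
bound = n * s ** (n * n - 1)  # size of [n] x S^(n^2 - 1)
print(zeros, bound)
```

For $K_{2,2}$ with $s = 3$ the count stays below the bound, as the surjection requires. Exhaustive enumeration of this kind is exactly what VPV cannot do in general, which is why the theorems construct an explicit surjection instead.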

4.3. Formalizing the $\mathrm{RNC}^2$ algorithm for the bipartite perfect matching decision
problem. An instance of the bipartite perfect matching decision problem is a bipartite
graph $G$ encoded by a matrix $E_{n \times n}$, and we are to decide if $G$ has a perfect matching.
Here is an RDET algorithm for the problem. The algorithm is essentially due to Lovász
[20]. From $E$, construct the Edmonds matrix $A(X)$ for $G$ and choose a random sequence
$\vec{r}_{n \times n} \in [2n]^{n^2}$. If $\mathrm{Det}(A(\vec{r})) \neq 0$ then we report that $G$ has a perfect matching. Otherwise,
we report that $G$ does not have a perfect matching.

We claim that VPV proves correctness of this algorithm. The correctness assertion
states that if $G$ has a perfect matching then the algorithm reports NO with probability at
most 1/2, and otherwise it certainly reports NO. Theorem 3.1 shows that VPV proves the
latter. Conversely, if $G$ has a perfect matching given by a permutation matrix $M$ then the
function $H(n, 2n, A, M, \cdot)$ of Theorem 4.4 witnesses that the probability of $\mathrm{Det}(A(\vec{r})) = 0$
is at most 1/2, according to Definition 2.1, where $A$ is the Edmonds matrix for $G$. Hence
VPV proves the correctness of this case too.

Since $\mathrm{RDET} \subseteq \mathrm{FRNC}^2$, this algorithm (which solves a decision problem) is also an $\mathrm{RNC}^2$
algorithm.
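
The decision procedure above can be sketched in a few lines of Python. This is a hedged illustration rather than the formalized algorithm: the function names are hypothetical, random values are drawn from $[0, 2n)$ (matching $S = [0, s)$ with $s = 2n$ in Theorem 4.4), and the determinant is computed by naive cofactor expansion instead of a parallel algorithm such as Berkowitz's.

```python
import random


def det(m):
    """Exact integer determinant by cofactor expansion (tiny matrices only)."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([r[:j] + r[j + 1:] for r in m[1:]])
               for j in range(len(m)))


def lovasz_test(e, rng):
    """One run of the randomized test on the 0/1 edge matrix `e`.

    Returns True only if the graph certainly has a perfect matching;
    a False answer is wrong with probability at most 1/2.
    """
    n = len(e)
    # Substitute a random value from [0, 2n) for every variable of the
    # Edmonds matrix; entries for absent edges stay 0.
    a = [[rng.randrange(2 * n) if e[i][j] else 0 for j in range(n)]
         for i in range(n)]
    return det(a) != 0


def decide(e, trials, rng):
    """Repeat the test to push the one-sided error below 2**(-trials)."""
    return any(lovasz_test(e, rng) for _ in range(trials))


rng = random.Random(1)
no_matching = [[1, 1], [0, 0]]    # second row vertex is isolated
with_matching = [[1, 0], [0, 1]]  # the identity matching
print(decide(no_matching, 50, rng), decide(with_matching, 50, rng))
```

The error is one-sided: a nonzero determinant is a certificate that some diagonal of the Edmonds matrix is nonzero, so a YES answer is never wrong.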



5. Formalizing the Hungarian algorithm

The Hungarian algorithm is a combinatorial optimization algorithm which solves the
maximum-weight bipartite matching problem in polytime and anticipated the later development of the
powerful primal-dual method. The algorithm was developed by Kuhn [19], who gave the
name "Hungarian method" since it was based on the earlier work of two Hungarian
mathematicians: D. Kőnig and J. Egerváry. Munkres later reviewed the algorithm and showed
that it is indeed polytime [23]. Although the Hungarian algorithm is interesting by itself,
we formalize the algorithm since we need it in the VPV proof of the Isolating Lemma for
perfect matchings in Section 6.1.

The Hungarian algorithm finds a maximum-weight matching for any weighted bipartite
graph. The algorithm and its correctness proof are simpler if we make the two following
changes. First, since edges with negative weights can never be in a maximum-weight
matching, and thus can be safely deleted, we can assume that every edge has nonnegative
weight. Second, by assigning zero weight to every edge not present, we only need to consider
weighted complete bipartite graphs.

Let $G = (X \uplus Y, E)$ be a complete bipartite graph, where $X = \{x_i \mid 1 \le i \le n\}$ and
$Y = \{y_i \mid 1 \le i \le n\}$, and let $w$ be an integer weight assignment to the edges of $G$, where
$w_{i,j} \ge 0$ is the weight of the edge $\{x_i, y_j\} \in E$.

A pair of integer sequences $u = (u_i)_{i=1}^n$ and $v = (v_j)_{j=1}^n$ is called a weight cover if

$$\forall i, j \in [n],\quad w_{i,j} \le u_i + v_j. \qquad (5.1)$$

The cost of a weight cover is $\mathrm{cost}(u, v) := \sum_{i=1}^n (u_i + v_i)$, and the weight of a matching
$M$ is $w(M) := \sum_{(i,j) \in M} w_{i,j}$.

The Hungarian algorithm is based on the following important observation.

Lemma 5.1. (VPV $\vdash$) For any matching $M$ and weight cover $(u, v)$, we have
$w(M) \le \mathrm{cost}(u, v)$.

Proof. Since the edges in a matching $M$ are disjoint, summing the constraints $w_{i,j} \le u_i + v_j$
over all edges of $M$ yields

$$w(M) \le \sum_{(i,j) \in M} (u_i + v_j).$$

Since no edge has negative weight, we have $u_i + v_j \ge 0$ for all $i, j \in [n]$, and thus

$$\sum_{(i,j) \in M} (u_i + v_j) \le \mathrm{cost}(u, v)$$

for every matching $M$ and every weight cover $(u, v)$. □

Given a weight cover $(u, v)$, the equality subgraph $H_{u,v}$ is the subgraph of $G$ whose
vertices are $X \uplus Y$ and whose edges are precisely those $\{x_i, y_j\} \in E$ satisfying $w_{i,j} = u_i + v_j$.

Theorem 5.2. (VPV $\vdash$) Let $H = H_{u,v}$ be the equality subgraph, and let $M$ be a maximum
cardinality matching of $H$. Then the following three statements are equivalent:

(1) $w(M) = \mathrm{cost}(u, v)$.

(2) $M$ is a maximum-weight matching of $G$ and $(u, v)$ is a minimum-weight cover of $G$.

(3) $M$ is a perfect matching of the equality subgraph $H$.

(cf. Appendix A.3 for the full proof of this theorem.)

Below we give a simplified version of the Hungarian algorithm which runs in polynomial
time when the edge weights are small (i.e. presented in unary notation). The correctness
of the algorithm easily follows from Theorem 5.2.

Algorithm 5.3 (The Hungarian algorithm). We start with an arbitrary weight cover $(u, v)$
with small weights: e.g. let

$$u_i = \max\{w_{i,j} \mid 1 \le j \le n\}$$

and $v_i = 0$ for all $i \in [n]$. If the equality subgraph $H_{u,v}$ has a perfect matching $M$, we
report $M$ as a maximum-weight matching of $G$. Otherwise, change the weight cover $(u, v)$
as follows. Since the maximum matching $M$ is not a perfect matching of $H$, Hall's
condition fails for $H$. Thus it is not hard (cf. Corollary [1] from Appendix A.2) to construct
in polytime a subset $S \subseteq X$ satisfying $|N(S)| < |S|$, where $N(S)$ denotes the neighborhood
of $S$. Hence we can calculate the quantity

$$\delta = \min\{u_i + v_j - w_{i,j} \mid x_i \in S \wedge y_j \notin N(S)\},$$

and decrease $u_i$ by $\delta$ for all $x_i \in S$ and increase $v_j$ by $\delta$ for all $y_j \in N(S)$ without violating
the weight cover property (5.1). This strictly decreases the sum $\sum_{i=1}^n (u_i + v_i)$. Thus this
process can repeat at most as many times as the initial cost of the cover $(u, v)$. Assuming
that all edge weights are small (i.e. presented in unary), the algorithm terminates in
polynomial time. Finally we get an equality subgraph $H_{u,v}$ containing a perfect matching
$M$, which by Theorem 5.2 is also a maximum-weight matching of $G$.
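
To make the cover-update loop concrete, here is a minimal Python sketch of Algorithm 5.3. It is an illustration under stated assumptions, not the VPV formalization: the helper names are hypothetical, the maximum matching of the equality subgraph is found with a standard augmenting-path search, and the Hall violator $S$ is taken to be the set of $X$-vertices reachable from an unmatched vertex by alternating paths (one common way to obtain such a set).

```python
def max_matching(adj, n):
    """Maximum matching by repeated augmenting-path search.

    Returns match_y, where match_y[j] is the X-index matched to y_j or -1.
    """
    match_y = [-1] * n

    def try_augment(i, seen):
        for j in adj[i]:
            if j not in seen:
                seen.add(j)
                if match_y[j] == -1 or try_augment(match_y[j], seen):
                    match_y[j] = i
                    return True
        return False

    for i in range(n):
        try_augment(i, set())
    return match_y


def hall_violator(adj, match_y, n):
    """Return (S, N(S)) with |N(S)| < |S| for a maximum, non-perfect matching."""
    matched_x = {i for i in match_y if i != -1}
    root = next(i for i in range(n) if i not in matched_x)
    s, ns, stack = {root}, set(), [root]
    while stack:
        i = stack.pop()
        for j in adj[i]:
            if j not in ns:
                ns.add(j)
                # y_j is matched; otherwise the matching was not maximum.
                s.add(match_y[j])
                stack.append(match_y[j])
    return s, ns


def hungarian(w):
    """Maximum-weight perfect matching of K_{n,n} with nonnegative weights."""
    n = len(w)
    u = [max(row) for row in w]  # initial weight cover
    v = [0] * n
    while True:
        # Equality subgraph: edges with w[i][j] == u[i] + v[j].
        adj = [[j for j in range(n) if w[i][j] == u[i] + v[j]]
               for i in range(n)]
        match_y = max_matching(adj, n)
        if -1 not in match_y:
            return sorted((match_y[j], j) for j in range(n)), u, v
        s, ns = hall_violator(adj, match_y, n)
        delta = min(u[i] + v[j] - w[i][j]
                    for i in s for j in range(n) if j not in ns)
        for i in s:
            u[i] -= delta
        for j in ns:
            v[j] += delta


w = [[4, 1, 3], [2, 0, 5], [3, 2, 2]]
matching, u, v = hungarian(w)
print(matching, sum(u) + sum(v))
```

On termination the weight of the returned matching equals the cost of the final cover, which is statement (1) of Theorem 5.2.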



When formalizing the Isolating Lemma for bipartite matchings, we need a VPV function
Mwpm that takes as inputs an edge relation $E_{n \times n}$ of a bipartite graph $G$ and a nonnegative
weight assignment $w$ to the edges in $E$, and outputs a minimum-weight perfect matching if
such a matching exists, or outputs $\emptyset$ to indicate that no perfect matching exists. Recall that
the Hungarian algorithm returns a maximum-weight matching, and not a minimum-weight
perfect matching. However, we can use the Hungarian algorithm to compute $\mathrm{Mwpm}(n, E, w)$
as follows.

Algorithm 5.4 (Finding a minimum-weight perfect matching).
1: Let $c = n \cdot \max\{w_{i,j} \mid (i,j) \in E\} + 1$
2: Construct the sequence $w'$ as follows:

   $$w'_{i,j} = \begin{cases} c - w_{i,j} & \text{if } (i,j) \in E \\ 0 & \text{otherwise.} \end{cases}$$

3: Run the Hungarian algorithm on the complete bipartite graph $K_{n,n}$ with weight
   assignment $w'$ to get a maximum-weight matching $M$.
4: if $M$ contains an edge that is not in $E$ then
5:     return the empty matching $\emptyset$
6: else
7:     return $M$
8: end if

Note that since we assign zero weights to the edges not present and very large weights
to other edges, the Hungarian algorithm will always prefer the edges that are present in
the original bipartite graph. More formally, for any perfect matching $M$ of the original graph
and any matching $N$ of $K_{n,n}$ that is not a perfect matching of the original graph, we have

$$\begin{aligned}
w'(M) &\ge nc - n \cdot \max\{w_{i,j} \mid (i,j) \in E\} \\
      &= (n-1)c + \left(c - n \cdot \max\{w_{i,j} \mid (i,j) \in E\}\right) \\
      &= (n-1)c + 1 \\
      &> (n-1)c \\
      &\ge w'(N).
\end{aligned}$$

The last inequality follows from the fact that $w'_{i,j} \le c$ for all $(i,j) \in N$ and that $N$ contains
at most $n - 1$ edges of $E$. Thus, if the
Hungarian algorithm returns a matching $M$ with at least one edge not in $E$, then the
original graph cannot have a perfect matching. Also, from the way the weight assignment
$w'$ was defined, every maximum-weight perfect matching of $K_{n,n}$ with weight assignment
$w'$ that uses only edges of $E$ is a minimum-weight perfect matching of the original bipartite graph.

It is straightforward to check that the above argument can be formalized in VPV, so
VPV proves the correctness of Algorithm 5.4 for computing the function Mwpm.
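
The weight transformation of Algorithm 5.4 is easy to exercise on small inputs. In the Python sketch below the function names are hypothetical, and `max_weight_matching_complete` is a brute-force stand-in for the Hungarian algorithm (it enumerates all perfect matchings of $K_{n,n}$, which suffices because the weights $w'$ are nonnegative); only `mwpm` mirrors the steps of the algorithm.

```python
from itertools import permutations


def max_weight_matching_complete(wp):
    """Brute-force stand-in for the Hungarian algorithm on K_{n,n}."""
    n = len(wp)
    best = max(permutations(range(n)),
               key=lambda p: sum(wp[i][p[i]] for i in range(n)))
    return [(i, best[i]) for i in range(n)]


def mwpm(e, w):
    """Minimum-weight perfect matching of the graph with 0/1 edge matrix `e`.

    Returns a list of edges (i, j), or [] if no perfect matching exists.
    """
    n = len(e)
    present = [w[i][j] for i in range(n) for j in range(n) if e[i][j]]
    c = n * max(present, default=0) + 1                      # step 1
    wp = [[c - w[i][j] if e[i][j] else 0 for j in range(n)]  # step 2
          for i in range(n)]
    m = max_weight_matching_complete(wp)                     # step 3
    if any(not e[i][j] for i, j in m):                       # steps 4-8
        return []
    return m


e = [[1, 1, 0], [1, 1, 1], [0, 1, 1]]
w = [[1, 5, 0], [2, 1, 4], [0, 3, 2]]
print(mwpm(e, w))
print(mwpm([[1, 0], [1, 0]], [[1, 0], [2, 0]]))
```

In the first graph the perfect matchings have weights 4, 8 and 9, and the sketch returns the one of weight 4. In the second graph both row vertices share a single neighbor, so the maximum-weight matching of $K_{2,2}$ is forced to use an absent edge and the empty matching is returned.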

6. $\mathrm{FRNC}^2$ algorithm for finding a bipartite perfect matching

Below we recall the elegant $\mathrm{FRNC}^2$ (or more precisely RDET) algorithm due to Mulmuley,
Vazirani and Vazirani [22] for finding a bipartite perfect matching. Although the original
algorithm works for general undirected graphs, we will only focus on bipartite graphs in
this paper.

Let $G$ be a bipartite graph with two disjoint sets of vertices $U = \{u_1, \ldots, u_n\}$ and
$V = \{v_1, \ldots, v_n\}$. We first consider the minimum-weight bipartite perfect matching problem,
where each edge $(u_i, v_j) \in E$ is assigned an integer weight $w_{i,j} \ge 0$, and we want to find
a minimum-weight perfect matching of $G$. It turns out there is a DET algorithm for this
problem under two assumptions: the weights must be polynomial in $n$, and the
minimum-weight perfect matching must be unique. We let $A(X)$ be an Edmonds matrix of the
bipartite graph. Replace $X_{i,j}$ with $W_{i,j} = 2^{w_{i,j}}$ (this is where we need the weights to be
small). We then compute $\mathrm{Det}(A(W))$ using Berkowitz's $\mathrm{FNC}^2$ algorithm. Assume that
there exists exactly one (unknown) minimum-weight perfect matching $M$. We will show in
Theorem 6.5 that $w(M)$ is exactly the position of the least significant 1-bit, i.e., the number
of trailing zeros, in the binary expansion of $\mathrm{Det}(A(W))$. Once we have $w(M)$, we can test if
an edge $(u_i, v_j) \in E$ belongs to the unique minimum-weight perfect matching $M$ as follows.
Let $w'$ be the position of the least significant 1-bit of $\mathrm{Det}(A[i \mid j](W))$. We will show in
Theorem 6.6 that the edge $(u_i, v_j)$ is in the perfect matching if and only if $w'$ is precisely
$w(M) - w_{i,j}$. Thus, we can test all edges in parallel. Note that up to this point, everything
can be done in $\mathrm{DET} \subseteq \mathrm{FNC}^2$ since the most expensive operation is the Det function, which
is complete for DET.

What we have so far is that, assuming that the minimum-weight perfect matching
exists and is unique, there is a DET algorithm for finding this minimum-weight perfect
matching. But how do we guarantee that if a minimum-weight perfect matching exists,
then it is unique? It turns out that we can assign every edge $(u_i, v_j) \in E$ a random weight
$w_{i,j} \in [2m]$, where $m = |E|$, and use the Isolating Lemma [22] to ensure that the graph has
a unique minimum-weight perfect matching with probability at least 1/2.

The $\mathrm{RDET} \subseteq \mathrm{FRNC}^2$ algorithm for finding a perfect matching is now complete: assign
random weights to the edges, and run the DET algorithm for the unique minimum-weight
perfect matching problem. If a perfect matching exists, with probability at least 1/2, this
algorithm returns a perfect matching.

6.1. Isolating a perfect matching. We will recall the Isolating Lemma [22], the key
ingredient of the Mulmuley-Vazirani-Vazirani $\mathrm{FRNC}^2$ algorithm for finding a perfect matching.
Let $X$ be a set with $m$ elements $\{a_1, \ldots, a_m\}$ and let $\mathcal{F}$ be a family of subsets of $X$. We
assign a weight $w_i$ to each element $a_i \in X$, and define the weight of a set $Y \in \mathcal{F}$ to be
$w(Y) := \sum_{a_i \in Y} w_i$. Let minimum-weight be the minimum of the weights of all the sets
in $\mathcal{F}$. Note that several sets of $\mathcal{F}$ might achieve minimum-weight. However, if
minimum-weight is achieved by a unique $Y \in \mathcal{F}$ then we say that the weight assignment $w = (w_i)_{i=1}^m$
is isolating for $\mathcal{F}$. (Every weight assignment is isolating if $|\mathcal{F}| \le 1$.)

Theorem 6.1 (Isolating Lemma [22]). Let $\mathcal{F}$ be a family of subsets of an $m$-element set
$X = \{a_1, \ldots, a_m\}$. Let $w = (w_i)_{i=1}^m$ be a random weight assignment to the elements in $X$.
Then

$$\Pr_{w \in [k]^m}\left[w \text{ is not isolating for } \mathcal{F}\right] \le \frac{m}{k}.$$

To formalize the Isolating Lemma in VPV it seems natural to present the family $\mathcal{F}$
by a polytime algorithm. This is difficult to do in general (see Remark 6.3 below), so we
will formalize a special case which suffices to formalize the $\mathrm{FRNC}^2$ algorithm for finding a
bipartite perfect matching. Thus we are given a bipartite graph $G$, and the family $\mathcal{F}$ is the
set of perfect matchings of $G$. We want to show that if we assign random weights to the
edges, then the probability that this weight assignment does not isolate a perfect matching
is small. Note that although the family $\mathcal{F}$ here might be exponentially large, $\mathcal{F}$ is polytime
definable, since recognizing a perfect matching is easy.

Theorem 6.2 (Isolating a Perfect Matching). (VPV $\vdash$) Let $\mathcal{F}$ be the family of perfect
matchings of a bipartite graph $G$ with edges $E = \{e_1, \ldots, e_m\}$. Let $w$ be a random weight
assignment to the edges in $E$. Then

$$\Pr_{w \in [k]^m}\left[w \text{ is not isolating for } \mathcal{F}\right] \precsim m/k.$$

For brevity, we will call a weight assignment $w$ "bad" if $w$ is not isolating for $\mathcal{F}$. Let

$$\Phi := \{w \in [k]^m \mid w \text{ is bad for } \mathcal{F}\}.$$

Then to prove Theorem 6.2 it suffices to construct a VPV function mapping $[m] \times [k]^{m-1}$
onto $\Phi$. Note that the upper bound $m/k$ is independent of the size $n$ of the two vertex sets.

The set $\Phi$ is polytime definable since $w \in \Phi$ iff

$$\exists i, j \in [n] \left( E(i,j) \text{ and } M(i,j) \text{ and } \neg M'(i,j) \text{ and } M, M' \text{ encode two perfect matchings with the same weight} \right),$$

where $M$ denotes the output produced by applying the Mwpm function (Algorithm 5.4) on
$G$, and $M'$ denotes the output produced by applying Mwpm on the graph obtained from $G$
by deleting the edge $(i, j)$.

Proof of Theorem 6.2. By Definition 2.1 we may assume that $\Phi$ is nonempty, so there is
an element $\delta \in \Phi$. (We will use $\delta$ as a "dummy" element.) It suffices for us to construct
explicitly a VPV function $\varphi$ mapping $[m] \times [k]^{m-1}$ onto $\Phi$. For each $i \in [m]$ we interpret the
set $\{i\} \times [k]^{m-1}$ as the set of all possible weight assignments to the $m - 1$ edges $E \setminus \{e_i\}$.
Our function $\varphi$ will map each set $\{i\} \times [k]^{m-1}$ onto the set of those bad weight assignments
$w$ such that the graph $G$ contains two distinct minimum-weight perfect matchings $M$ and
$M'$ with $e_i \in M \setminus M'$.

The function $\varphi$ takes as input a sequence

$$\langle i, w_1, \ldots, w_{i-1}, w_{i+1}, \ldots, w_m \rangle$$

from $[m] \times [k]^{m-1}$ and does the following. Use the function Mwpm (defined by Algorithm 5.4)
to find a minimum-weight perfect matching $M'$ of $G$ with the edge $e_i$ deleted. Use Mwpm
to find a minimum-weight perfect matching $M_1$ of the subgraph $G \setminus \{u_j, v_\ell\}$, where $u_j$ and
$v_\ell$ are the two endpoints of $e_i$. If both perfect matchings $M'$ and $M_1$ exist and satisfy
$w(M') - w(M_1) \in [k]$, then $\varphi$ outputs the sequence

$$\langle w_1, \ldots, w_{i-1},\ w(M') - w(M_1),\ w_{i+1}, \ldots, w_m \rangle. \qquad (6.1)$$

Otherwise $\varphi$ outputs the dummy element $\delta$ of $\Phi$.

Note that if both $M'$ and $M_1$ exist, then (6.1) is a bad weight assignment, since $M'$ and
$M = M_1 \cup \{e_i\}$ are distinct minimum-weight perfect matchings of $G$ under this assignment.

To show that $\varphi$ is surjective, consider an arbitrary bad weight assignment $w = (w_i)_{i=1}^m \in \Phi$.
Since $w$ is bad, there are two distinct minimum-weight perfect matchings $M$ and $M'$
and some edge $e_i \in M \setminus M'$. Thus from how $\varphi$ was defined,

$$\langle i, w_1, \ldots, w_{i-1}, w_{i+1}, \ldots, w_m \rangle \in [m] \times [k]^{m-1}$$

is an element that gets mapped to the bad weight assignment $w$. □
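
The surjection in the proof above can be checked exhaustively on a very small graph. The following Python sketch is an illustration only: all function names are hypothetical, brute-force enumeration of permutations stands in for Mwpm, and $[k]$ is assumed to denote $\{1, \ldots, k\}$. It computes the set of bad weight assignments for $K_{2,2}$ directly, computes the image of the map described in the proof, and compares the two.

```python
from itertools import permutations, product


def min_pm_weight(rows, cols, edges, w):
    """Minimum weight of a perfect matching between rows and cols, or None."""
    best = None
    for p in permutations(cols):
        pairs = list(zip(rows, p))
        if all(e in edges for e in pairs):
            total = sum(w[e] for e in pairs)
            if best is None or total < best:
                best = total
    return best


def is_bad(n, edges, w):
    """True iff at least two perfect matchings attain the minimum weight."""
    totals = []
    for p in permutations(range(n)):
        pairs = list(zip(range(n), p))
        if all(e in edges for e in pairs):
            totals.append(sum(w[e] for e in pairs))
    return len(totals) >= 2 and totals.count(min(totals)) >= 2


def phi(n, edge_list, k, i, rest):
    """Image of (i, rest) under the map of the proof; None plays the dummy."""
    e_i = edge_list[i]
    others = edge_list[:i] + edge_list[i + 1:]
    w = dict(zip(others, rest))
    verts = list(range(n))
    # M': minimum-weight perfect matching of G with e_i deleted.
    m_prime = min_pm_weight(verts, verts, set(others), w)
    # M_1: minimum-weight perfect matching of G minus the endpoints of e_i.
    rows = [a for a in verts if a != e_i[0]]
    cols = [b for b in verts if b != e_i[1]]
    m_one = min_pm_weight(rows, cols, set(others), w)
    if m_prime is None or m_one is None:
        return None
    d = m_prime - m_one
    if not 1 <= d <= k:
        return None
    return rest[:i] + (d,) + rest[i:]


n, k = 2, 3
edge_list = [(0, 0), (0, 1), (1, 0), (1, 1)]
m = len(edge_list)
weights = range(1, k + 1)
bad = {w for w in product(weights, repeat=m)
       if is_bad(n, set(edge_list), dict(zip(edge_list, w)))}
image = {phi(n, edge_list, k, i, rest)
         for i in range(m) for rest in product(weights, repeat=m - 1)}
image.discard(None)
print(len(bad), m * k ** (m - 1), image == bad)
```

On this instance the image coincides with the set of bad assignments, so the number of bad assignments is at most $m \cdot k^{m-1}$, which is the bound $m/k$ after dividing by $k^m$.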

Remark 6.3. The above proof uses the fact that there is a polytime algorithm for finding
a minimum-weight perfect matching (when one exists) in an edge-weighted bipartite graph.
This suggests limitations on formalizing a more general version of Theorem 6.2 in VPV. For
example, if $\mathcal{F}$ is the set of Hamiltonian cycles in a complete graph, then finding a
minimum-weight member of $\mathcal{F}$ is NP-hard.

6.2. Extracting the unique minimum-weight perfect matching. Let $G$ be a bipartite
graph and assume that $G$ has a perfect matching. Then in Section 6.1 we formalized a
version of the Isolating Lemma, which with high probability gives us a weight assignment
$w$ for which $G$ has a unique minimum-weight perfect matching. This is the first step of the
Mulmuley-Vazirani-Vazirani algorithm. Now we proceed with the second step, where we
need to output this minimum-weight perfect matching using a DET function.

Let $B$ be the matrix we get by substituting $W_{i,j} = 2^{w_{i,j}}$ for each nonzero entry of
the Edmonds matrix $A$ of $G$. We want to show that if $M$ is the unique minimum-weight
perfect matching of $G$ with respect to $w$, then the weight $w(M)$ is exactly the position of
the least significant 1-bit in the binary expansion of $\mathrm{Det}(B)$. The usual proof of this fact is
not hard, but it uses properties of the Lagrange expansion for the determinant, which has
exponentially many terms and hence cannot be formalized in VPV. Our proof avoids using
the Lagrange expansion, and utilizes properties of the cofactor expansion instead.

Lemma 6.4. (VPV $\vdash$) There is a VPV function that takes as inputs an $n \times n$ Edmonds
matrix $A$ and a weight sequence

$$W = \langle W_{i,j} = 2^{w_{i,j}} \mid 1 \le i, j \le n \rangle.$$

If $B = A(W)$ satisfies $\mathrm{Det}(B) \neq 0$ and $p$ is the position of the least significant 1-bit of
$\mathrm{Det}(B)$, then the VPV function outputs a perfect matching $M$ of weight at most $p$.

It is worth noting that the lemma holds regardless of whether or not the bipartite
graph corresponding to $A$ and weight assignment $W$ has a unique minimum-weight perfect
matching.

The proof of Lemma 6.4 is very similar to that of Theorem 3.1. Recall that in
Theorem 3.1, given a matrix $B$ satisfying $\mathrm{Det}(B) \neq 0$, we want to extract a nonzero diagonal
of $B$. In this lemma, we are given the position $p$ of the least significant 1-bit of $\mathrm{Det}(B)$,
and we want to get a nonzero diagonal of $B$ whose product has the least significant 1-bit at
position at most $p$. For this, we can use the same method of extracting the nonzero diagonal
from Theorem 3.1 with the following modification. When choosing a term of the cofactor
expansion on the recursive step, we will also need to make sure the chosen term produces a
nonzero sub-diagonal of $B$ that will not contribute too much weight to the diagonal we are
trying to extract. This ensures that the product of the chosen diagonal has its least
significant 1-bit at position at most $p$.

For the rest of this section, we define $\mathrm{numz}(Y)$ to be the position of the least significant
1-bit of the binary string $Y$. Thus if $\mathrm{numz}(Y) = q$ then $Y = \pm 2^q Z$ for some positive odd
integer $Z$.



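Proof of Lemma 6.4. We construct a sequence of matrices

$$B_n, B_{n-1}, \ldots, B_1$$

where $B_n = B$ and $B_{i-1} = B_i[i \mid j_i]$ for $i = n, \ldots, 2$, where the index $j_i$ is chosen as follows.
Define

$$T_i := \left( \prod_{\ell=i+1}^{n} B_\ell(\ell, j_\ell) \right) \mathrm{Det}(B_i).$$

Assume we are given $j_n, \ldots, j_{i+1}$ such that $\mathrm{numz}(T_i) \le p$. We want to choose $j_i$ such that
$\mathrm{numz}(T_{i-1}) \le p$, where by definition

$$T_{i-1} = \left( \prod_{\ell=i}^{n} B_\ell(\ell, j_\ell) \right) \mathrm{Det}(B_{i-1}) = \left( \prod_{\ell=i}^{n} B_\ell(\ell, j_\ell) \right) \mathrm{Det}(B_i[i \mid j_i]).$$

This can be done as follows. From the cofactor expansion of $\mathrm{Det}(B_i)$, we have

$$T_i = \sum_{j=1}^{i} (-1)^{i+j} \left( \prod_{\ell=i+1}^{n} B_\ell(\ell, j_\ell) \right) B_i(i, j)\, \mathrm{Det}(B_i[i \mid j]).$$

Since $\mathrm{numz}(T_i) \le p$, at least one of the terms in the sum must have its least significant 1-bit
at position at most $p$. Thus, we can choose $j_i$ such that

$$\mathrm{numz}\left( \left( \prod_{\ell=i+1}^{n} B_\ell(\ell, j_\ell) \right) B_i(i, j_i)\, \mathrm{Det}(B_i[i \mid j_i]) \right) = \mathrm{numz}(T_{i-1})$$

is minimized, which guarantees that $\mathrm{numz}(T_{i-1}) \le p$.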

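Since by assumption $\mathrm{numz}(T_n) = \mathrm{numz}(\mathrm{Det}(B_n)) = p$, VPV proves by $\Sigma_0^B(\mathcal{L}_{FP})$
induction on $i = n, \ldots, 1$ that

$$\mathrm{numz}(T_i) \le p.$$

If we define $j_1 = 1$, then when $i = 1$ we have

$$T_1 = \left( \prod_{\ell=2}^{n} B_\ell(\ell, j_\ell) \right) \mathrm{Det}(B_1) = \prod_{\ell=1}^{n} B_\ell(\ell, j_\ell).$$

Thus it follows that $\mathrm{numz}(T_1) = \mathrm{numz}\left( \prod_{\ell=1}^{n} B_\ell(\ell, j_\ell) \right) \le p$.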

Similarly to the proof of Theorem 3.1, we can extract a perfect matching with weight
at most $p$ by letting $Q$ be a matrix, where $Q(i,j) = j$ for all $i, j \in [n]$. Then we compute
another sequence of matrices

$$Q_n, Q_{n-1}, \ldots, Q_1,$$

where $Q_n = Q$ and $Q_{i-1} = Q_i[i \mid j_i]$, i.e., we delete from $Q_i$ exactly the row and column we
deleted from $B_i$.

To prove that $M = \{(\ell, Q_\ell(\ell, j_\ell)) \mid 1 \le \ell \le n\}$ is a perfect matching, we note that
whenever a pair $(i, k)$ is added to the matching $M$, we delete the row $i$ and column $j_i$,
where $j_i$ is the index satisfying $Q_i(i, j_i) = k$. So we can never match any other vertex to $k$
again.

It remains to show that $w(M) \le p$. Since

$$\prod_{\ell=1}^{n} B_\ell(\ell, j_\ell) = 2^{w(M)},$$

the binary expansion of $\prod_{\ell=1}^{n} B_\ell(\ell, j_\ell)$ has a unique one at position $w(M)$ and zeros
elsewhere. Thus it follows from how the matching $M$ was constructed that $w(M) \le p$. □
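
The extraction procedure in the proof of Lemma 6.4 can be sketched as follows. This Python code is an illustration under stated assumptions: the names `det`, `numz`, `minor` and `extract_matching` are hypothetical, rows are processed from the last to the first as in the proof, the list `cols` plays the role of the matrices $Q_i$, and a naive cofactor-expansion determinant replaces the DET computation. Minimizing numz over the terms of the cofactor expansion suffices because the common prefix product is the same for every term.

```python
def det(m):
    """Exact integer determinant by cofactor expansion (tiny matrices only)."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([r[:j] + r[j + 1:] for r in m[1:]])
               for j in range(len(m)))


def numz(y):
    """Position of the least significant 1-bit of a nonzero integer."""
    y = abs(y)
    return (y & -y).bit_length() - 1


def minor(m, i, j):
    """The matrix m[i | j]: delete row i and column j."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(m) if r != i]


def extract_matching(w, edges):
    """Return (p, M) with p = numz(Det(B)) and M a perfect matching of weight <= p.

    Returns None if Det(B) = 0.
    """
    n = len(w)
    b = [[2 ** w[i][j] if (i, j) in edges else 0 for j in range(n)]
         for i in range(n)]
    d = det(b)
    if d == 0:
        return None
    p = numz(d)
    cols = list(range(n))  # original column labels, as tracked by Q
    matching = []
    for i in range(n - 1, -1, -1):  # the last row of the current matrix
        best_j, best_z = None, None
        for j in range(len(cols)):
            term = b[i][j] * det(minor(b, i, j))
            if term != 0 and (best_z is None or numz(term) < best_z):
                best_j, best_z = j, numz(term)
        matching.append((i, cols[best_j]))
        b = minor(b, i, best_j)
        del cols[best_j]
    return p, sorted(matching)


w = [[1, 3], [2, 1]]
edges = {(0, 0), (0, 1), (1, 0), (1, 1)}
p, matching = extract_matching(w, edges)
print(p, matching, sum(w[i][j] for i, j in matching))
```

Here $\mathrm{Det}(B) = 2 \cdot 2 - 8 \cdot 4 = -28$, so $p = 2$, and the extracted matching has weight 2.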

The next two theorems complete our description and justification of our RDET
algorithm for finding a perfect matching. For these theorems we are given a bipartite graph
$G = (U \uplus V, E)$, where we have $U = \{u_1, \ldots, u_n\}$ and $V = \{v_1, \ldots, v_n\}$, and each edge
$(u_i, v_j) \in E$ is assigned a weight $w_{i,j}$ such that $G$ has a unique minimum-weight perfect
matching (see Theorem 6.2). Let $W_{n \times n}$ be a sequence satisfying $W_{i,j} = 2^{w_{i,j}}$ for all $(i,j) \in E$. Let
$A$ be the Edmonds matrix of $G$, and let $B = A(W)$. Let $M$ denote the unique minimum-weight
perfect matching of $G$.

Theorem 6.5. (VPV $\vdash$) The weight $p = w(M)$ is exactly $\mathrm{numz}(\mathrm{Det}(B))$.

If in Lemma 6.4 we tried to extract an appropriate nonzero diagonal of $B$ using the
determinant and minors of $B$ as our guide, then in the proof of this theorem we do the
reverse. From a minimum-weight perfect matching $M$ of $G$, we want to rebuild in
polynomially many steps suitable minors of $B$ until we fully recover the determinant of $B$. We can
then prove by $\Sigma_0^B(\mathcal{L}_{FP})$ induction that in every step of this process, each "partial
determinant" of $B$ has the least significant 1-bit at position $p$. Note that the technique we used to
prove this theorem does have some similarity to that of Lemma 3.2, even though the proof
of this theorem is more complicated.

Proof. Let $Q$ be a matrix, where $Q(i,j) = j$ for all $i, j \in [n]$. For $1 \le i \le n$ let $B_i$ be the
result of deleting rows $i + 1, \ldots, n$ and columns $M(i+1), \ldots, M(n)$ from $B$ and let $Q_i$ be
$Q$ with the same rows and columns deleted. We can construct these matrices inductively
in the form of two matrix sequences

$$B_n, B_{n-1}, \ldots, B_1 \qquad Q_n, Q_{n-1}, \ldots, Q_1$$

where

• we let $B_n = B$ and $Q_n = Q$, and

• for $i = n, n-1, \ldots, 2$, define $j_i$ to be the unique index satisfying
  $M(i) = Q_i(i, j_i)$,
  and then let $B_{i-1} = B_i[i \mid j_i]$ and $Q_{i-1} = Q_i[i \mid j_i]$.

Then (setting $j_1 = 1$)

$$B_i(i, j_i) = B(i, M(i)) = 2^{w_{i,M(i)}}, \quad 1 \le i \le n. \qquad (6.2)$$

Claim: $\mathrm{numz}(\mathrm{Det}(B_i)) = \sum_{\ell=1}^{i} w_{\ell,M(\ell)}$ for all $i \in [n]$.

The theorem follows from this by setting $i = n$. We will prove the claim by induction
on $i$. The base case $i = 1$ follows from (6.2).
For the induction step, it suffices to show

$$\mathrm{numz}(\mathrm{Det}(B_{i+1})) = \mathrm{numz}(\mathrm{Det}(B_i)) + w_{i+1,M(i+1)}.$$

From the cofactor expansion formula we have

$$\mathrm{Det}(B_{i+1}) = \sum_{j=1}^{i+1} (-1)^{(i+1)+j} B_{i+1}(i+1, j)\, \mathrm{Det}(B_{i+1}[i+1 \mid j]).$$

Since $B_{i+1}(i+1, j_{i+1}) = 2^{w_{i+1,M(i+1)}}$ by (6.2), and $\mathrm{Det}(B_{i+1}[i+1 \mid j_{i+1}]) = \mathrm{Det}(B_i)$, it suffices
to show that if

$$R := B_{i+1}(i+1, j_{i+1})\, \mathrm{Det}(B_{i+1}[i+1 \mid j_{i+1}])$$

then

$$\mathrm{numz}(R) < \mathrm{numz}\left(B_{i+1}(i+1, j)\, \mathrm{Det}(B_{i+1}[i+1 \mid j])\right)$$

for all $j \neq j_{i+1}$.

Suppose for a contradiction that there is some $j' \neq j_{i+1}$ such that

$$\mathrm{numz}\left(B_{i+1}(i+1, j')\, \mathrm{Det}(B_{i+1}[i+1 \mid j'])\right) \le \mathrm{numz}(R).$$

Then, we can extend the set of edges

$$\{(n, M(n)), \ldots, (i+2, M(i+2)), (i+1, j')\}$$

with $i$ edges extracted from $B_{i+1}[i+1 \mid j']$ (using the method from Lemma 6.4) to get
a perfect matching of $G$ with weight at most $p$, which contradicts that $M$ is the unique
minimum-weight perfect matching of $G$. □

To extract the edges of $M$ in DET, we need to decide if an edge $(i, j)$ belongs to the
unique minimum-weight perfect matching $M$ without knowledge of other edges in $M$. The
next theorem, whose proof follows directly from Lemma 6.4 and Theorem 6.5, gives us that
method.

Theorem 6.6. (VPV $\vdash$) For every edge $(i, j) \in E$, we have $(i, j) \in M$ if and only if

$$w(M) - w_{i,j} = \mathrm{numz}(\mathrm{Det}(B[i \mid j])).$$

Proof. ($\Rightarrow$): Assume $(i, j) \in M$. Then the bipartite graph $G' = G \setminus \{u_i, v_j\}$ must have a
unique minimum-weight perfect matching of weight $w(M) - w_{i,j}$. Thus from Theorem 6.5

$$\mathrm{numz}(\mathrm{Det}(B[i \mid j])) = w(M) - w_{i,j}.$$

($\Leftarrow$): We prove the contrapositive. Assume $(i, j) \notin M$. Suppose for a contradiction
that

$$w(M) - w_{i,j} = \mathrm{numz}(\mathrm{Det}(B[i \mid j])).$$

Then by Lemma 6.4 we can extract from the submatrix $B[i \mid j]$ a perfect matching $Q$ of the
bipartite graph $G' = G \setminus \{u_i, v_j\}$ with weight at most $w(M) - w_{i,j}$. But then $M' = Q \cup \{(i, j)\}$
is another perfect matching of $G$ with $w(M') \le w(M)$, a contradiction. □

Theorems 6.2, 6.5 and 6.6 complete the description and justification of our RDET
algorithm for finding a perfect matching in a bipartite graph. Since these are theorems of
VPV, it follows that VPV proves the correctness of the algorithm.
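
Theorems 6.5 and 6.6 translate into a short extraction routine. The Python sketch below is illustrative only: the function names are hypothetical, the determinant is computed by naive cofactor expansion rather than in DET, and the input is assumed to have a unique minimum-weight perfect matching (without that assumption the returned edge set need not be a matching). Each edge is tested independently of the others, which is what makes the tests parallelizable.

```python
def det(m):
    """Exact integer determinant by cofactor expansion (tiny matrices only)."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([r[:j] + r[j + 1:] for r in m[1:]])
               for j in range(len(m)))


def numz(y):
    """Position of the least significant 1-bit of a nonzero integer."""
    y = abs(y)
    return (y & -y).bit_length() - 1


def minor(m, i, j):
    """The matrix m[i | j]: delete row i and column j."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(m) if r != i]


def unique_min_matching(w, edges):
    """Return (p, M): p = numz(Det(B)) and the edges passing the test of Theorem 6.6.

    Assumes the minimum-weight perfect matching is unique; returns None
    when Det(B) = 0.
    """
    n = len(w)
    b = [[2 ** w[i][j] if (i, j) in edges else 0 for j in range(n)]
         for i in range(n)]
    d = det(b)
    if d == 0:
        return None
    p = numz(d)  # equals w(M) by Theorem 6.5
    chosen = []
    for (i, j) in sorted(edges):  # each test is independent of the others
        dm = det(minor(b, i, j))
        if dm != 0 and numz(dm) == p - w[i][j]:
            chosen.append((i, j))
    return p, chosen


edges = {(0, 0), (0, 1), (1, 0), (1, 1), (1, 2), (2, 1), (2, 2)}
w = [[1, 5, 0], [2, 1, 4], [0, 3, 2]]
p, chosen = unique_min_matching(w, edges)
print(p, chosen)
```

In this example the perfect matchings have weights 4, 8 and 9, the determinant of $B$ is $-752 = -2^4 \cdot 47$, and exactly the three edges of the weight-4 matching pass the test.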

6.3. Related bipartite matching problems. The correctness of the
Mulmuley-Vazirani-Vazirani algorithm can easily be used to establish the correctness of RDET algorithms for
related matching problems, for example, the maximum (cardinality) bipartite matching
problem and the minimum-weight bipartite perfect matching problem, where the weights
assigned to the edges are small. We refer to [22] for more details on these reductions.

7. Conclusion and Future Work 

We have only considered randomized matching algorithms for bipartite graphs. For general 
undirected graphs, we need Tutte's matrix (cf. [22]), a generalization of Edmonds' matrix. 
Since every Tutte matrix is a skew symmetric matrix where each variable appears exactly 
twice, we cannot directly apply our technique for Edmonds' matrices, where each variable 
appears at most once. However, by using the recursive definition of the Pfaffian instead of 
the cofactor expansion, we believe that it is also possible to generalize our results to general 
undirected graphs. We also note that the Hungarian algorithm only works for weighted 
bipartite graphs. To find a maximum-weight matching of a weighted undirected graph, we 
need to formalize Edmonds' blossom algorithm (cf. [16]). Once we have the correctness
of the blossom algorithm, the proof of the Isolating Lemma for undirected graph perfect
matchings will be the same as that of Theorem 6.2. We leave the detailed proofs for the
general undirected graph case for future work.

It is worth noticing that symbolic determinants of Edmonds' matrices result in very
special polynomials, whose structures can be used to define the VPV surjections witnessing
the probability bound in the Schwartz-Zippel Lemma as demonstrated in this paper. It
remains an open problem whether we can prove the full version of the Schwartz-Zippel
Lemma using Jeřábek's method within the theory VPV.

We have shown that the correctness proofs for several randomized algorithms can be
formalized in the theory VPV for polynomial time reasoning. But some of the algorithms
involved are in the subclass DET of polynomial time, where DET is the closure of #L
(and also of the integer determinant function Det) under $\mathrm{AC}^0$ reductions. As mentioned
in Section 2.1, the subtheory V#L of VPV can define all functions in DET, but it is open
whether V#L can prove properties of Det such as the expansion by minors. However, the
theory V#L + CH can prove such properties, where CH is an axiom stating the
Cayley-Hamilton Theorem. Thus in the statements of Lemma 3.2 and Theorem 4.3 we could have
replaced (VPV $\vdash$) by (V#L + CH $\vdash$). We could have done the same for Theorem 4.4 if
we changed the argument $W$ of the function $H$ to $M$, where $M$ is a permutation matrix
encoding a perfect matching for the underlying bipartite graph. This modified statement of
the theorem still proves the interesting direction of the correctness of the bipartite perfect
matching algorithm in Section 4.3, since the function $H$ is used only to bound the error
assuming that $G$ does have a perfect matching.

We leave open the question of whether any of the other correctness proofs can be
formalized in V#L + CH.

We believe that Jeřábek's framework deserves to be studied in greater depth since it
helps us to understand better the connection between probabilistic reasoning and weak
systems of bounded arithmetic. We are working on using Jeřábek's ideas to formalize
constructive aspects of fundamental theorems in finite probability in the spirit of the recent
beautiful work by Moser and Tardos [21], Impagliazzo and Kabanets [11], etc.

Appendix A. Formalizing the Hungarian algorithm 

Before proceeding with the Hungarian algorithm, we need to formalize the two most fun- 
damental theorems for bipartite matching: Berge's Theorem and Hall's Theorem. 

A.1. Formalizing Berge's Theorem and the augmenting-path algorithm. Let
$G = (X \uplus Y, E)$ be a bipartite graph, where $X = \{x_i \mid 1 \le i \le n\}$ and $Y = \{y_i \mid 1 \le i \le n\}$.
Formally, to make sure that $X$ and $Y$ are disjoint, we can let $x_i := i$ and $y_i := n + i$. We
encode the edge relation $E$ of $G$ by a matrix $E_{n \times n}$, where $E(i,j) = 1$ iff $x_i$ is adjacent to
$y_j$. Note that we often abuse notation and write $\{u, v\} \in E$ to denote that $u$ and $v$ are
adjacent in $G$, which formally means either

$$u \in X \wedge v \in Y \wedge E(u, v - n), \text{ or}$$

$$v \in X \wedge u \in Y \wedge E(v, u - n).$$

This complication is due to the usual convention of using an $n \times n$ matrix to encode the
edge relation of a bipartite graph with $2n$ vertices.

An $n \times n$ matrix $M$ encodes a matching of $G$ iff $M$ is a permutation matrix satisfying

$$\forall i, j \in [n],\ M(i,j) \rightarrow E(i,j).$$

We represent a path by a sequence of vertices $\langle v_1, \ldots, v_k \rangle$ with $\{v_i, v_{i+1}\} \in E$ for all
$i \in [k-1]$.

Given a matching $M$, a vertex $v$ is M-saturated if $v$ is incident with an edge in $M$.
We will say $v$ is M-unsaturated if it is not M-saturated. A path $P = \langle v_1, \ldots, v_k \rangle$ is an
M-alternating path if $P$ alternates between edges in $M$ and edges in $E \setminus M$. More formally,
$P$ is an M-alternating path if either of the following two conditions holds:

• For every $i \in \{1, \ldots, k-1\}$, $\{v_i, v_{i+1}\} \in E \setminus M$ if $i$ is odd, and $\{v_i, v_{i+1}\} \in M$ if $i$
  is even.

• For every $i \in \{1, \ldots, k-1\}$, $\{v_i, v_{i+1}\} \in E \setminus M$ if $i$ is even, and $\{v_i, v_{i+1}\} \in M$ if $i$
  is odd.

An M-alternating path $\langle v_1, \ldots, v_k \rangle$ is an M-augmenting path if the vertices $v_1$ and
$v_k$ are M-unsaturated.

Theorem A.1 (Berge's Theorem). (VPV $\vdash$) Let $G = (X \uplus Y, E)$ be a bipartite graph. A
matching $M$ is maximum iff there is no M-augmenting path in $G$.

Proof. ($\Rightarrow$): Assume that all matchings $N$ of $E$ satisfy $|N| \le |M|$. Suppose for a
contradiction that there is an M-augmenting path $P$. Let $M \oplus P$ denote the symmetric difference
of two sets of edges $M$ and $P$. Then $M' = M \oplus P$ is a matching larger than $M$, a
contradiction.

($\Leftarrow$): We will prove the contrapositive. Assume there is another matching $M'$ satisfying
$|M'| > |M|$. We want to construct an M-augmenting path in $G$.

Consider $Q = M' \oplus M$. Since $|M'| > |M|$, it follows that $|M' \setminus M| > |M \setminus M'|$, and
thus

$$|Q \cap M'| > |Q \cap M|. \qquad (A.1)$$

Note that we can compute cardinalities of the sets directly here since all the sets we are
considering here are small. Now let $H$ be the graph whose edge relation is $Q$ and whose
vertices are simply the vertices of $G$. We then observe the following properties of $H$:

• Since $Q$ is constructed from two matchings $M$ and $M'$, every vertex of $H$ can only
  be incident with at most two edges: one from $M$ and another from $M'$. So every
  vertex of $H$ has degree at most 2.

• Any path of $H$ must alternate between the edges of $M$ and $M'$.

We will provide a polytime algorithm to extract from the graph $H$ an augmenting path with
respect to $M$, which gives us the contradiction.

1: Initialize $K = H$ and $i = 1$
2: while $K \neq \emptyset$ do
3:     Pick the least vertex $v \in K$
4:     Compute the connected component $C_i$ containing $v$
5:     if $C_i$ is an M-augmenting path then
6:         return $C_i$ and halt.
7:     end if
8:     Update $K = K \setminus C_i$ and $i = i + 1$.
9: end while

Note that since $H$ has $2n$ vertices, the while loop can only iterate at most $2n$ times. It
only remains to show the following.

Claim: The algorithm returns an M-augmenting path assuming \M'\ > |M|. 

Suppose for a contradiction that the algorithm would never produce any M-augmenting
path. Since H has degree at most two, in every iteration of the while loop, we know that
the connected component $C_i$ is

• either a cycle, which means $|C_i \cap M| = |C_i \cap M'|$, or

• a path but not an M-augmenting path, which implies that $|C_i \cap M| \ge |C_i \cap M'|$.

Since $Q = M' \oplus M = \bigcup_i C_i$ and all $C_i$ are disjoint, we have

\[ |Q \cap M| = \Bigl|\bigcup_i (C_i \cap M)\Bigr| \ge \Bigl|\bigcup_i (C_i \cap M')\Bigr| = |Q \cap M'|. \]

But this contradicts (A.1). □
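
The extraction procedure from the proof can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `find_augmenting_component` and the frozenset edge encoding are hypothetical, and the test "is an M-augmenting path" is replaced by the count comparison $|C_i \cap M'| > |C_i \cap M|$, which picks out the same components because every component of $M \oplus M'$ is an alternating path or an even cycle.

```python
def edge(a, b):
    """Undirected edge {a, b}, stored as a frozenset (illustrative encoding)."""
    return frozenset((a, b))

def find_augmenting_component(M, M_prime):
    """Scan the components of H = M xor M' in order of their least vertex and
    return the edge set of the first one with more M'-edges than M-edges."""
    Q = M ^ M_prime
    adj = {}
    for e in Q:
        a, b = tuple(e)
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    remaining = set(adj)  # vertices of H that touch at least one edge of Q
    while remaining:
        v = min(remaining)
        # Collect the connected component containing v.
        component, stack = {v}, [v]
        while stack:
            a = stack.pop()
            for b in adj[a]:
                if b not in component:
                    component.add(b)
                    stack.append(b)
        component_edges = {e for e in Q if e <= component}
        if len(component_edges & M_prime) > len(component_edges & M):
            return component_edges
        remaining -= component
    return None

# Hypothetical example with |M'| > |M|.
M = {edge("x1", "y1")}
M_prime = {edge("x1", "y2"), edge("x2", "y1")}
path = find_augmenting_component(M, M_prime)
```

Isolated vertices of H are skipped here, since a single vertex is never an augmenting path.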

Algorithm A.2 (The augmenting-path algorithm). As a corollary of Berge's Theorem, we
have the following simple algorithm for finding a maximum matching of a bipartite graph
G. We start from any matching M of G, say the empty matching. We repeatedly locate an
M-augmenting path P, augment M along P, and replace M by the resulting matching.
We stop when there is no M-augmenting path; by Berge's Theorem, M is then maximum. Thus, it
remains to show how to search for an M-augmenting path given a matching M of G.

Algorithm A.3 (The augmenting-path search algorithm). First, from G and M we construct
a directed graph H, where the vertices $V_H$ of H are exactly the vertices $X \uplus Y$ of G,
and the edge relation $E_H$ of H is a $2n \times 2n$ matrix defined as follows:

\[ E_H := \{(x,y) \in X \times Y \mid \{x,y\} \in E \setminus M\} \cup \{(y,x) \in Y \times X \mid \{y,x\} \in M\}. \]

The key observation is that t is reachable from s by an M-alternating path in the bipartite 
graph G iff t is reachable from s in the directed graph H. 
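
As an illustration, the matrix $E_H$ could be built as in the following Python sketch. The indexing is an assumption of the sketch, not something fixed by the text: vertices are 0-based, $x_i$ gets index i, $y_j$ gets index n + j, and E and M are sets of pairs (i, j) standing for the edge $\{x_i, y_j\}$.

```python
def build_directed_graph(n, E, M):
    """Return the 2n x 2n 0/1 matrix E_H: edges of E outside M are directed
    from X to Y, and edges of M are directed from Y to X."""
    EH = [[0] * (2 * n) for _ in range(2 * n)]
    for (i, j) in E:
        if (i, j) in M:
            EH[n + j][i] = 1  # matching edge: y_j -> x_i
        else:
            EH[i][n + j] = 1  # non-matching edge: x_i -> y_j
    return EH

# Hypothetical example with n = 2 and the matching {x_0 y_0}.
E = {(0, 0), (0, 1), (1, 0)}
M = {(0, 0)}
EH = build_directed_graph(2, E, M)
```

Every directed path in this graph alternates between edges outside M and edges of M, which is why reachability in H corresponds to reachability by M-alternating paths in G.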

After constructing the graph H, we can search for an M-augmenting path using the
breadth-first search algorithm as follows. Let s be an M-unsaturated vertex in X. We
construct two $2n \times 2n$ matrices S and T as follows.
1: The row Row(1, S) of S encodes the set {s}, the starting point of our search.
2: From Row(i, S), the set Row(i + 1, S) is defined as follows: $j \in \mathrm{Row}(i+1, S)$ iff there
exists some $k \in [2n]$ such that

\[ \mathrm{Row}(i,S)(k) \wedge E_H(k,j) \wedge \forall \ell \in [i],\ \neg\mathrm{Row}(\ell,S)(j). \]

After constructing Row(i + 1, S), we update T by setting T(j, k) to 1 for every
$k \in \mathrm{Row}(i,S)$ and $j \in \mathrm{Row}(i+1,S)$ satisfying $E_H(k,j)$.

Intuitively, Row(i, S) encodes the set of vertices that are at distance i − 1 from s, and T is
the auxiliary matrix that can be used to recover a path from s to any vertex $j \in \mathrm{Row}(i,S)$
for all $i \in [2n]$.

Remark A.4. Let $R = \bigcup_{i=2}^{2n} \mathrm{Row}(i, S)$. Then R is the set of vertices reachable from s by an
M-alternating path. This follows from the fact that our construction mimics the construction
of the formula $\delta_{\mathrm{CONN}}$, which was used to define the theory VNL for NL in [?, Chapter 9].
The matrix T is constructed to help us trace a path from every vertex $v \in R$ back to s.

By induction on i, we can prove that $\mathrm{Row}(i, S) \subseteq X$ for every odd $i \in [2n]$, and
$\mathrm{Row}(i, S) \subseteq Y$ for every even $i \in [2n]$. Thus, after having S and T, we choose the largest
even $i^*$ such that $\mathrm{Row}(i^*, S) \neq \emptyset$, and then search the set $\mathrm{Row}(i^*, S) \cap Y$ for an
M-unsaturated vertex t. If such a vertex t exists, then we use T to trace back a path to s. This
path will be our M-augmenting path. If no such vertex t exists, we report that there is no
M-augmenting path.
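
The search and the surrounding augmenting-path loop can be sketched in Python as follows. This is a hedged illustration, not the formalization itself: the function names and the 0-based indexing ($x_i$ at index i, $y_j$ at index n + j) are assumptions, and the sketch scans every even row for an M-unsaturated vertex rather than only the last nonempty one, since an unsaturated vertex of Y can also occur in an earlier row.

```python
def search_augmenting_path(n, E, M, s):
    """Layered BFS from the M-unsaturated vertex x_s. Returns the vertex
    indices of an M-augmenting path starting at s, or None."""
    N = 2 * n
    EH = [[0] * N for _ in range(N)]
    for (i, j) in E:
        if (i, j) in M:
            EH[n + j][i] = 1
        else:
            EH[i][n + j] = 1
    S = [[0] * N for _ in range(N)]  # S[i][v] = 1 iff v is in Row(i + 1, S)
    T = [[0] * N for _ in range(N)]  # T[j][k] = 1 iff k precedes j in the BFS
    S[0][s] = 1
    seen = {s}
    for i in range(N - 1):
        layer = [k for k in range(N) if S[i][k]]
        new = {j for k in layer for j in range(N) if EH[k][j] and j not in seen}
        for j in new:
            S[i + 1][j] = 1
            for k in layer:
                if EH[k][j]:
                    T[j][k] = 1
        seen |= new
    saturated_y = {j for (_, j) in M}
    for i in range(1, N, 2):  # 0-based odd index = even row in the text
        for t in range(n, N):
            if S[i][t] and (t - n) not in saturated_y:
                path = [t]
                while path[-1] != s:  # follow predecessors recorded in T
                    j = path[-1]
                    path.append(next(k for k in range(N) if T[j][k]))
                return path[::-1]
    return None

def maximum_matching(n, E):
    """Augment the empty matching until no augmenting path is found."""
    M = set()
    while True:
        matched_x = {i for (i, _) in M}
        path = None
        for s in range(n):
            if s not in matched_x:
                path = search_augmenting_path(n, E, M, s)
                if path is not None:
                    break
        if path is None:
            return M
        # Convert consecutive path vertices back into pairs (i, j).
        P = {(a, b - n) if a < n else (b, a - n) for a, b in zip(path, path[1:])}
        M = M ^ P

# Hypothetical example graph on 3 + 3 vertices.
E = {(0, 0), (0, 1), (1, 0), (2, 1), (2, 2)}
M = maximum_matching(3, E)
```

Because T only records edges from one row to the next, the trace-back moves one row closer to s at each step and therefore terminates.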
A.2. Formalizing Hall's Theorem.

Theorem A.5 (Hall's Theorem). ($\mathrm{VPV} \vdash$) Let $G = (X \uplus Y, E)$ be a bipartite graph. Then
G has a perfect matching iff for every subset $S \subseteq X$, $|S| \le |N(S)|$, where N(S) denotes the
neighborhood of S in G.

The condition $\forall S \subseteq X,\ |S| \le |N(S)|$, which is necessary and sufficient for a bipartite
graph to have a perfect matching, is called Hall's condition. We encode a set $S \subseteq X$ in the
theorem as a binary string of length n, where S(i) = 1 iff $x_i \in S$. Similarly, we encode the
neighborhood N(S) as a binary string of length n, and we define

\[ N(S) := \bigcup \{\mathrm{Row}(i,E) \mid x_i \in S\}, \]

where the union can be computed by taking the disjunction of all binary vectors in the set

\[ \{\mathrm{Row}(i,E) \mid x_i \in S\} \]

componentwise. Note that we can compute the cardinalities of S and N(S) directly since
both of these sets are subsets of small sets X and Y.
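
The componentwise disjunction can be sketched in Python. The sketch assumes, for illustration only, that E is given as an n × n 0/1 matrix `E_rows` with `E_rows[i][j] = 1` iff $\{x_i, y_j\} \in E$, using 0-based indices; the function name `neighborhood` is hypothetical.

```python
def neighborhood(E_rows, S_bits):
    """Return the bit vector of N(S): the componentwise OR of the rows
    Row(i, E) over all i with S_bits[i] = 1."""
    n = len(E_rows)
    result = [0] * n
    for i in range(n):
        if S_bits[i]:
            result = [a | b for a, b in zip(result, E_rows[i])]
    return result

# Hypothetical example: x_0 ~ {y_0, y_1}, x_1 ~ {y_0}, x_2 ~ {y_2}.
E_rows = [[1, 1, 0],
          [1, 0, 0],
          [0, 0, 1]]
S_bits = [1, 1, 0]
N_bits = neighborhood(E_rows, S_bits)
```

Both cardinalities are then simple bit counts, `sum(S_bits)` and `sum(N_bits)`.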

Proof. ($\Rightarrow$): Assume M is a perfect matching of G. Given a subset $S \subseteq X$, the vertices
in S are matched to the vertices in some subset $T \subseteq Y$ by the perfect matching M, where
$|S| = |T|$. Since $T \subseteq N(S)$, we have $|N(S)| \ge |T| = |S|$.

($\Leftarrow$): We will prove the contrapositive. Assume G does not have a perfect matching;
we want to construct a subset $S \subseteq X$ such that $|N(S)| < |S|$.

Let M be a maximum but not perfect matching constructed by the augmenting-path
algorithm. Since M is not a perfect matching, there is some M-unsaturated vertex $s \in X$.
Let S and T be the result of running the augmenting-path search algorithm from s; then
$R := \bigcup_{i=2}^{2n} \mathrm{Row}(i, S)$ is the set of all vertices reachable from s by an M-alternating path.
Since there is no M-augmenting path, all the vertices in R are M-saturated. We want to
show the following two claims.

Claim 1: The vertices in $R \cap X$ are all matched to the vertices in $R \cap Y$ by M, and
$|R \cap X| = |R \cap Y|$.

Suppose for a contradiction that some vertex $v \in R$ is not matched to any vertex $u \in R$
by M. Since we already know that all vertices in R are M-saturated, v is matched to some
vertex $w \notin R$ by M. But this is a contradiction since w must be reachable from s by an
alternating path, and so the augmenting-path search algorithm must already have added w
to R. Thus, the vertices in $R \cap X$ are all matched to the vertices in $R \cap Y$ by M, which
implies that $|R \cap X| = |R \cap Y|$.

Claim 2: $N(R \cap X) = R \cap Y$.

Since the vertices in $R \cap X$ are matched to $R \cap Y$, we know $N(R \cap X) \supseteq R \cap Y$. Suppose for a
contradiction that $N(R \cap X) \supsetneq R \cap Y$. Let $v \in N(R \cap X) \setminus (R \cap Y)$, and let $u \in R \cap X$ be
the vertex adjacent to v. Since u is reachable from s by an M-alternating path P, we can
extend P to get an M-alternating path from s to v, which contradicts that v is not added
to R.

We note that $N(\{s\} \cup (R \cap X)) = R \cap Y$. Then $S = \{s\} \cup (R \cap X)$ is the desired set
since

\[ |N(S)| = |R \cap Y| = |R \cap X| < |S|. \]

□
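
The construction of the violating set in this proof can be sketched in Python. To keep the block short, the maximum matching is computed with the standard recursive augmenting-path search (often called Kuhn's algorithm) instead of the matrix-based search of Algorithm A.3; that substitution, the function name `hall_violator`, and the adjacency-list input are assumptions of the sketch.

```python
def hall_violator(n, adj):
    """adj[i] lists the indices j with {x_i, y_j} in E (0-based).
    Returns a set S of X-indices with |N(S)| < |S|, or None if the graph
    has a perfect matching."""
    match_y = [-1] * n  # match_y[j] = i iff x_i is matched to y_j

    def try_augment(i, visited):
        for j in adj[i]:
            if j not in visited:
                visited.add(j)
                if match_y[j] == -1 or try_augment(match_y[j], visited):
                    match_y[j] = i
                    return True
        return False

    for i in range(n):
        try_augment(i, set())
    matched_x = {i for i in match_y if i != -1}
    unsaturated = [i for i in range(n) if i not in matched_x]
    if not unsaturated:
        return None
    s = unsaturated[0]
    # Alternating reachability from s: X -> Y along any edge of E,
    # Y -> X along the matching edge. S collects {s} together with R cap X.
    S, frontier, reached_y = {s}, [s], set()
    while frontier:
        i = frontier.pop()
        for j in adj[i]:
            if j not in reached_y:
                reached_y.add(j)
                partner = match_y[j]  # exists because the matching is maximum
                if partner not in S:
                    S.add(partner)
                    frontier.append(partner)
    return S

# Hypothetical example: x_0 and x_1 are both adjacent only to y_0.
adj = [[0], [0], [1, 2]]
S = hall_violator(3, adj)
N_S = {j for i in S for j in adj[i]}
```

In this example the returned set has two vertices of X whose common neighborhood is the single vertex $y_0$.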
From the proof of Hall's Theorem, we have the following corollary saying that if a
bipartite graph does not have a perfect matching, then we can find in polytime a subset of
vertices violating Hall's condition.

Corollary 1. ($\mathrm{VPV} \vdash$) There is a VPV function that, on input a bipartite graph G that
does not have a perfect matching, outputs a subset $S \subseteq X$ such that $|S| > |N(S)|$.



A.3. Proof of Theorem 5.2. Let $H = H_{\vec{u},\vec{v}}$ be the equality subgraph for the weight cover
$(\vec{u},\vec{v})$, and let M be a maximum cardinality matching of H. Recall that Theorem 5.2 asks us
to show that VPV proves the equivalence of the following three statements:

(1) $w(M) = \mathrm{cost}(\vec{u},\vec{v})$

(2) M is a maximum-weight matching and the cover $(\vec{u},\vec{v})$ is a minimum-weight cover of G

(3) M is a perfect matching of H

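Proof of Theorem 5.2. (1) $\Rightarrow$ (2): Assume that $\mathrm{cost}(\vec{u},\vec{v}) = w(M)$. By Lemma 5.1, no
matching has weight greater than $\mathrm{cost}(\vec{u},\vec{v})$, and no cover has weight less than $w(M)$.
Hence M is a maximum-weight matching and $(\vec{u},\vec{v})$ is a minimum-weight cover.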

(2) $\Rightarrow$ (3): Assume M is a maximum-weight matching and $(\vec{u},\vec{v})$ is a minimum-weight
cover of G. Suppose for a contradiction that the maximum matching M is not a perfect
matching of H. We will construct a weight cover whose cost is strictly less than $\mathrm{cost}(\vec{u},\vec{v})$,
which contradicts that $(\vec{u},\vec{v})$ is a minimum-weight cover.

Since the maximum matching M is not a perfect matching of H, by Corollary 1 we
can construct in polytime a subset $S \subseteq X$ satisfying

\[ |N(S)| < |S|. \]

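Then we calculate the quantity

\[ \delta = \min\{u_i + v_j - w_{i,j} \mid x_i \in S \wedge y_j \notin N(S)\}. \]

Note that $\delta > 0$ since H is the equality subgraph. Next we construct a pair of sequences
$\vec{u}' = (u'_i)_{i=1}^{n}$ and $\vec{v}' = (v'_j)_{j=1}^{n}$, as follows:

\[
u'_i = \begin{cases} u_i - \delta & \text{if } x_i \in S \\ u_i & \text{if } x_i \notin S \end{cases}
\qquad
v'_j = \begin{cases} v_j + \delta & \text{if } y_j \in N(S) \\ v_j & \text{if } y_j \notin N(S) \end{cases}
\]

We claim that $(\vec{u}', \vec{v}')$ is again a weight cover. The condition $w_{i,j} \le u'_i + v'_j$ might only be
violated for $x_i \in S$ and $y_j \notin N(S)$. But since we chose $\delta \le u_i + v_j - w_{i,j}$, it follows that

\[ w_{i,j} \le (u_i - \delta) + v_j = u'_i + v'_j. \]

Since

\[
\mathrm{cost}(\vec{u}', \vec{v}') = \sum_{i=1}^{n} (u'_i + v'_i)
= \underbrace{\sum_{i=1}^{n} (u_i + v_i)}_{=\,\mathrm{cost}(\vec{u},\vec{v})} + \underbrace{\delta|N(S)| - \delta|S|}_{<\,0},
\]

it follows that $\mathrm{cost}(\vec{u}', \vec{v}') < \mathrm{cost}(\vec{u}, \vec{v})$.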
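
The cover update can be checked on a small instance with the following Python sketch. The function name `improve_cover`, the 0-based indexing and the example weights are hypothetical; the sets S and N(S) are passed in directly, with N(S) taken in the equality subgraph.

```python
def improve_cover(w, u, v, S, NS):
    """Given a weight cover (u, v) of the n x n weight matrix w and a set S
    of X-indices whose neighborhood NS in the equality subgraph satisfies
    |NS| < |S|, return (delta, u', v') as defined in the text."""
    n = len(w)
    delta = min(u[i] + v[j] - w[i][j]
                for i in S for j in range(n) if j not in NS)
    u_new = [u[i] - delta if i in S else u[i] for i in range(n)]
    v_new = [v[j] + delta if j in NS else v[j] for j in range(n)]
    return delta, u_new, v_new

def is_cover(w, u, v):
    """Check w[i][j] <= u[i] + v[j] for all i, j."""
    n = len(w)
    return all(w[i][j] <= u[i] + v[j] for i in range(n) for j in range(n))

# Hypothetical example: the equality subgraph of (u, v) has only the
# edges x_0 y_0 and x_1 y_0, so S = {x_0, x_1} has N(S) = {y_0}.
w = [[3, 1],
     [3, 2]]
u, v = [3, 3], [0, 0]
delta, u_new, v_new = improve_cover(w, u, v, S={0, 1}, NS={0})
```

Here the cost drops by $\delta(|S| - |N(S)|)$, matching the calculation above.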

(3) $\Rightarrow$ (1): Suppose M is a perfect matching of H. Then $w_{i,j} = u_i + v_j$ holds for all
edges in M. Summing the equalities $w_{i,j} = u_i + v_j$ over all edges of M yields the equality
$\mathrm{cost}(\vec{u},\vec{v}) = w(M)$. □
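
As a small numerical check of the last implication, the following Python sketch sums the tight equalities over a perfect matching of the equality subgraph. The instance and the helper names are hypothetical and serve only as an illustration.

```python
def cost(u, v):
    """Cost of a weight cover: the sum of all u_i and v_j."""
    return sum(u) + sum(v)

def weight(w, M):
    """Weight of a matching given as a set of pairs (i, j)."""
    return sum(w[i][j] for (i, j) in M)

def is_perfect_in_equality_subgraph(w, u, v, M):
    """Check that M saturates every vertex and uses only tight edges."""
    n = len(w)
    tight = all(w[i][j] == u[i] + v[j] for (i, j) in M)
    saturates = ({i for (i, _) in M} == set(range(n))
                 and {j for (_, j) in M} == set(range(n)))
    return tight and saturates and len(M) == n

# Hypothetical instance: both edges of M are tight for the cover (u, v).
w = [[3, 1],
     [3, 2]]
u, v = [2, 2], [1, 0]
M = {(0, 0), (1, 1)}
```

Each vertex contributes its $u_i$ or $v_j$ exactly once to the sum over M, which is why the two totals agree.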